THE TRANSFER VALUE OF GIVEN AND INDIVIDUALLY
DERIVED PRINCIPLES*


G. M. HASLERUD AND SHIRLEY MEYERS 
University of New Hampshire 


While there is general recognition of 
the value of organized learning for 
memory and transfer, differences on how 
that organization should be taught and at- 
tained lead to quite different theories of 
transfer. On the one hand are those who 
decry outside direction of learning when 
one is interested in transfer. Katona (7) 
using card and geometric puzzles found
that his memorization group was signifi- 
cantly poorer on invention and transfer to 
new problems than his “Help” group that 
had only examples. He concluded “... 
that formulating the general principle in 
words is not indispensable for achieving 
application,” (7, p. 89) but he was un- 
willing to say that learning of principles 
in words is always less efficient than by 
example. He put teaching the result as 
the worst method, teaching by stating the 
principle as intermediate, and teaching by 
example as best. However, Hendrix (5) 
found that with a mathematical principle 
those groups that discovered the prin- 
ciple independently and left it unverba- 
lized exceeded those who discovered and 
then verbalized, and both exceeded in 
transfer those who had the principle stated 
for them and then illustrated. 

*This research was partially supported

Opposed to Katona and Hendrix are
those like Craig who concluded: “The
more guidance a learner receives, the more
efficient his discovery will be; the more
efficient discovery is, the more learning 
and transfer will occur” (1, p. 72). In a 
further study with college groups and with 
the same method of having the S pick 
out that alternative among five which 
does not fit a principle (2) he verified 
that significantly more such problems were 
solved when the principle was stated above 
it than when the S was given only the 
instruction that one of the five items 
did not belong. One should note, however, 
that he found no difference between his 
groups on transfer to new principles nor 
was there any difference in retention after 
3 or 17 days, although at 31 days the dif- 
ference favored the directed group. 

While Craig’s experimental results ac- 
tually gave little or no support to his 
claim that guidance is desirable for trans- 
fer, more serious opposition to Hendrix 
and Katona came from Kittell (8). He 
found that “intermediate direction” (stating
a principle) was significantly superior
to both the “minimal direction” (told
only that one of five alternatives would
not fit) and “maximal direction” (E
told the principle and worked out the 
answer for S), with minimal direction 
definitely the inferior method. His sub- 
jects were sixth graders while Craig’s 
were college students. That difference in 
age and educational level may explain the 
contrast in results. Also Kittell’s low num- 
ber of successful solutions (means only 
4.59 for intermediate direction and 1.93 
for minimal direction out of 15 principles)
suggests that the problems based on lin- 
guistic arrangements and meanings may 
have been too difficult for the Ss. If that 
were the case, then following directions 
in the stated principle was about the only 
way to solve the problem when unpro- 
vided with sufficient apperceptive mass 
and experience. Haslerud (4) found that 
while naive rats transferred anticipatively 
from forced turns near the goal into 
prior free units of a maze just as well 
as when those goal turnings had been 
established by trial and error, only active 
trial and error cul-de-sac elimination in 
the goal region could readjust an estab- 
lished pattern in the prior free units. If 
a similar limitation on effectiveness of 
guidance is present in human Ss, then one 
might expect any advantage primarily in 
young Ss and that mainly on their initial 
learning but none for memory and trans- 
fer where Ss have sufficient background to 
derive a solution themselves. 

While the Katona and Hendrix con- 
cept of how to get maximal transfer seems 
to have face validity, at least for adults, 
their controls and statistical supports are 
unsatisfactory. When one draws his con- 
clusions on the basis of one principle, e.g., 
the sum of the first n numbers, a question 
remains of how much of the conclusion is
a function of the particular problem used 
or the selection of individuals for the vari- 
ous groups. More convincing differentia- 
tion of principle given from principle 
derived would seem to require homogene- 
ously varied problems posed in quantity 
to the same individuals. A likely material 
has been found in an extension of the 
familiar cryptogram “Come to London” 
in the Stanford-Binet. An unpublished 
pilot study by the junior author under 
the senior author’s supervision indicated 
an advantage for memory of the inde- 
pendent solving of such coding principles. 
The present study extends a similar 
method to transfer. The hypothesis tested
is that principles derived by the learner 
solely from concrete instances will be more 
readily used in a new situation than those 
given to him in the form of a statement of 
principle and an instance. 


PROCEDURE 


Subjects for the experimental group 
were 76 members, ranging from freshmen 
to seniors, of two general psychology 
classes at the University of New Hamp- 
shire. The control group of 24 students 
in another psychology class ranged from 
sophomores to seniors. 

The experimental groups were each 
given two coding tests, the second being 
administered one week after the first. The 
control group was given only the second 
test. All tests were administered by the 
senior author. 

The first test composed of 20 coding 
problems was designed to give the students 
two types of experience: (a) problem 
solving with specific directions for de- 
ciphering the code printed above each 
problem, and (b) problem solving with no
directions given. The first part of each 
problem was the four-word sentence “They 
need more time,” followed by the same 
sentence in code. A different code was 
used in each problem. The second part of 
each problem was the four-word sentence 
“Give them five more,” which the Ss were 
asked to translate into the code for that 
problem. The given and derived problems 
were alternated so that the S would solve 
approximately equal numbers of each 
kind. As a control for differences between 
the codes, there were two test forms, A 
and B. The same codes were used in both, 
but those for which directions were given 
in form A had to be deciphered by the S 
in form B, and vice-versa. The problems 
were arranged in approximately the ap- 
parent order of difficulty. Examples of 
moderately easy coding rules are: “For 
each letter of the sentence write the letter
that follows it in the alphabet.” “Write 
the first two letters of each word and then 
the last two letters of each word.” To in- 
troduce the test the senior author told the 
Ss that the test was an experiment in 
cryptography. He wrote an illustrative 
code on the blackboard and purposely 
worked it out partly incorrectly to en- 
courage remonstrances from the group 
that a system or principle was possible. 
Ss were asked to solve the problems in the 
order they appeared on the test and to do 
as many as they could in the time allotted. 
Since the 45 minutes allotted was ample 
time for all but one or two students in 
each group, the test was essentially a 
power rather than a speed test. The Ss 
were not told that they would be retested 
on the same material. 
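
The two moderately easy coding rules quoted above can be sketched in a few lines of Python. This is only an illustration, not the authors' test material: the function names are hypothetical, the wrap-around from "z" to "a" is an assumption the text does not state, and the second rule is read here as "the first two letters of every word, followed by the last two letters of every word," which is one of two possible readings.

```python
import string


def shift_next_letter(sentence: str) -> str:
    """Rule 1: replace each letter with the letter that follows it in the alphabet."""
    out = []
    for ch in sentence:
        if ch.isascii() and ch.isalpha():
            alphabet = string.ascii_lowercase if ch.islower() else string.ascii_uppercase
            # Assumption: 'z' wraps around to 'a'; the source does not say.
            out.append(alphabet[(alphabet.index(ch) + 1) % 26])
        else:
            out.append(ch)  # spaces and punctuation are left unchanged
    return "".join(out)


def first_two_then_last_two(sentence: str) -> str:
    """Rule 2 (one possible reading): first two letters of every word,
    then the last two letters of every word."""
    words = sentence.split()
    heads = "".join(w[:2] for w in words)
    tails = "".join(w[-2:] for w in words)
    return heads + " " + tails


# The sentence the Ss were asked to translate on Test 1.
coded_shift = shift_next_letter("Give them five more")
coded_split = first_two_then_last_two("Give them five more")
```

Under these assumptions the model sentence "They need more time" becomes "Uifz offe npsf ujnf" under the first rule, which is the kind of worked instance a D-type problem would show without the rule itself.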

The second test printed only in one 
form was given to both the experimental 
and control groups. Again, the 20 codes 
used in the first test were used, but in- 
stead of the common sentence of Test 1, 
there were 20 different English sentences 
14 to 18 letters in length followed by four 
translations into code. Only one translation 
was correct, and the Ss were asked to 
check it. They were told that the other 
three were simply letters arranged in ran- 
dom order. They were not told that num- 
bers had been assigned to letters of the 
alphabet and that letters for two of the 
four codes had been selected according 
to the order in which those numbers ap- 
peared in a list of random numbers. The 
third false code was composed of letters 
of the English sentence arranged according 
to random numbers. The order in which 
the four codes followed the sentence and 
the order in which the problems were 
arranged on the test were also random. 
No mention was made of the previous test, 
nor was the purpose of the test told until 
both tests had been given and the results 
compiled. 
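
The construction of the three false alternatives can be sketched roughly as follows. The sketch assumes a pseudo-random generator in place of the printed list of random numbers the authors used, and the function name and seed are hypothetical; it is meant only to make the two kinds of false code concrete.

```python
import random
import string


def make_false_codes(sentence: str, seed: int = 0) -> list[str]:
    """Return three false alternatives for one Test 2 item.

    Two are random letters of the alphabet (letters numbered 1-26 and
    drawn by random number); the third is the sentence's own letters
    rearranged in random order.
    """
    rng = random.Random(seed)  # assumption: stands in for the random-number list
    letters = [c for c in sentence.lower() if c.isalpha()]
    n = len(letters)

    random_letter_codes = [
        "".join(string.ascii_lowercase[rng.randint(1, 26) - 1] for _ in range(n))
        for _ in range(2)
    ]

    rearranged = letters[:]
    rng.shuffle(rearranged)

    return random_letter_codes + ["".join(rearranged)]


false_codes = make_false_codes("Give them five more")
```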


RESULTS

The data for each individual in the ex- 
perimental group consisted of four scores: 
(a) Number of correct codings on Test 1
problems where the rule was given, hereafter
called G scores. (b) Number of correct
codings on Test 1 problems where the
coding principle had to be derived by the
S, hereafter called D scores. (c) Correct
alternatives for those codes in Test 2 that 
had been G type in Test 1. (d) Correct 
alternatives for those codes in Test 2 that 
had been D type in Test 1. In the control 
group the score was the total number of 
correct alternatives on Test 2. Any coding 
was considered correct if no more than 1 
of the 16 letters was wrong, since careless- 
ness rather than lack of understanding of 
the principle was probably responsible for 
the lone error. 
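
The scoring tolerance described above (a coding counts as correct if no more than 1 of its 16 letters is wrong) might be expressed as the following check. The function name is hypothetical, and treating a response of the wrong length as incorrect is an assumption the text does not address.

```python
def coding_is_correct(response: str, key: str, max_wrong: int = 1) -> bool:
    """Score one Test 1 coding against the correct coded sentence."""
    given = [c for c in response.lower() if c.isalpha()]
    wanted = [c for c in key.lower() if c.isalpha()]
    if len(given) != len(wanted):
        return False  # assumption: a response of the wrong length is scored wrong
    wrong = sum(g != w for g, w in zip(given, wanted))
    return wrong <= max_wrong
```

With the key "Hjwf uifn gjwf npsf", a response with a single slipped letter would still be credited, while one with two wrong letters would not.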

Since there was no difference between 
their results, the two experimental classes 
were combined. The analysis of results, 
however, was carried through separately 
for Forms A and B of Test 1 because a 
difference significant at the .05 level indi- 
cated that the 10 odd and the 10 even 
problems had not been exactly equated for 
difficulty. Nevertheless, the direction of 
results for both A and B groups showed 
equally high differentiation of G and D 
situations. 

Test 2 performance of the experimental 
group was significantly different from that 
of the control group. The means, 15.74 and 
10.75 respectively, differ beyond the .001 
level. Apparently something is transferred 
from the Test 1 experience. 

The crucial comparisons are between the
G and D kinds of problems. For both
Forms A and B on Test 1, significantly
more G problems were correctly coded:
8.86 and 8.36 for G against 5.86 and 4.88
for D. The results for Test 2
a week later are given in Table 1. If the
differences are added algebraically to the
Test 1 scores given in the previous sentence,
one obtains the nearly equal transfer
scores of Craig’s experiment (2). But
since each individual was his own control
for both G and D problems on Tests 1 and
2, it is legitimate to use the subtraction
method to find the standard error of the
difference for paired observations. The
correct identification of those codes which
had been D type on Test 1 increased 46.7%
while those which had been G decreased
10%. Both changes are significant, at the
.001 and .05 to .01 levels respectively.
There is reason to think that both curtailing
time to make Test 2 a speed test
and increasing time to greater than a week
between the learning of the codes on Test
1 and the transfer on Test 2 would
accentuate the differences.

TABLE 1
DIFFERENCE IN NUMBER OF PROBLEMS SUCCESSFULLY
CODED BETWEEN THE TRANSFER
TEST (TEST 2) AND THE INITIAL LEARNING
(TEST 1) WITH EACH INDIVIDUAL AS HIS
OWN CONTROL
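
The subtraction method for paired observations referred to above can be illustrated with a short sketch. The scores below are invented purely for illustration and are not the study's data; the function name is hypothetical. Each S's Test 1 score is subtracted from his Test 2 score, and the standard error is computed on those differences.

```python
import math
import statistics


def paired_difference(first, second):
    """Mean difference, its standard error, and the t ratio for paired scores."""
    diffs = [b - a for a, b in zip(first, second)]
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference: SD of the differences / sqrt(n)
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff, se, mean_diff / se


# Hypothetical D-type scores for four Ss (NOT data from the experiment).
test1_d = [4, 5, 6, 5]
test2_d = [6, 8, 7, 8]
mean_diff, se, t = paired_difference(test1_d, test2_d)
```

Because each S serves as his own control, differences between Ss in general coding ability drop out of the error term, which is why this method is more sensitive here than a comparison of two independent group means.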


DISCUSSION


This experiment has added strong sup- 
port to the contention of Katona and 
Hendrix that independently derived prin- 
ciples are more transferable than those 
where the principle is given to the student. 
Even though Ss produced more correct 
codings on the original learning when the 
principle was stated for them, on the “pay- 
off,” or “applying” to use Katona’s term, 
the advantage definitely passed to those
principles derived by the student himself. 
Fast and accurate learning or performance 
under immediate guidance is no guarantee 
of transfer to new problems without such 
support. From Craig’s and our experiments 
the conclusions just stated are supported 
by results on college students, but testing 
of grammar level students by principles of 
a more suitable level of difficulty than used 
by Kittell (8) might show a wider appli- 
cation. Our coding method could be easily 
adapted for that purpose. 

The obtained results of this experiment 
do not follow from inadequate controls. 
The alternate Forms A and B allowed
each principle to be given (G) and derived
(D). Individual differences with respect
to problem solving in the Ss were ruled 
out since each person responded to 10 G 
and 10 D problems on Test 1 and the 
follow-up of each of these on the transfer 
Test 2. The control group’s much poorer
performance on Test 2 indicated that a 
genuine transfer function was present. 
Making time on each test practically un- 
limited pushed the G and D types of pres- 
entation to their limit as power tests. 
Two possible weaknesses in the transfer 
Test 2 need to be examined. With four al- 
ternatives for each problem, a chance 
score would average 5. The control group 
had 10.75 problems correct; this showed 
good adaptation to the test but signifi- 
cantly less than the 15.74 of the experi- 
mental groups. The question whether the 
better performance of the experimental 
group was just the result of a second ses- 
sion of practice on coding problems can 
probably be answered by reference to the 
study by Warren (9). He found that 
adults on letter-symbol substitution rap- 
idly attain a plateau on transfer prob- 
lems because of “learning sets” from early 
childhood. Coding is in that class of simple
activities for adults where experience and 
practice as such make little difference after 
the first 10 minutes. Even if one took the 
maximum change of 37% during Warren’s
16 five-minute periods, it would be less 
than the nearly 50% advantage of our ex- 
perimental group over the control group. 
The second possible weakness arises from 
the randomized construction of the false 
alternatives of the transfer test. It is con- 
ceded that a person might try to solve the 
problems by excluding the three alterna- 
tives because of their random characteris- 
tics rather than by trying to recognize 
and verify some consistent principle in the 
one true alternative. However, the prin- 
ciples must have played a significant role 
in the solutions because without them the 
results of the control and experimental 
groups would have been equal since they 
had the same instructions and equal op- 
portunity to use this abortive device. 
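
The arithmetic behind the chance score and the "nearly 50% advantage" cited above can be checked directly. The figures are those reported in the text; the variable names are illustrative only.

```python
N_PROBLEMS = 20       # items on Test 2
N_ALTERNATIVES = 4    # one true code and three false ones per item

# Expected number correct from blind guessing.
chance_score = N_PROBLEMS / N_ALTERNATIVES

experimental_mean = 15.74
control_mean = 10.75

# Advantage of the experimental group, relative to the control mean.
relative_advantage = (experimental_mean - control_mean) / control_mean
```

The relative advantage works out to a little over 46%, which exceeds the 37% maximum change taken from Warren's practice data.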

The theories of transfer found in current 
educational psychologies are inadequate 
to explain the present experiment. The
senior author plans to develop in another 
place a theory that transfer is fundamen- 
tally an anticipative rather than a per- 
severative function and that to get trans- 
fer one must always counteract the finality 
of a goal (3). A stated principle to some 
extent, and even more Kittell’s “maximum 
guidance” of E doing the problems for S 
after giving him the principle, practically 
stops transfer, like other goals. Hendrix 
(6) states from Thorndike that only 5% 
of high school students have language 
ability sufficient to receive a ready-made 
sentence and find readily illustrations in 
their own background to provide the pre- 
requisite to meaning. If the results of the 
present experiment can be verified for a 
wider range of ages and apperceptive 
masses, then the implications for a direct
attempt to teach for transferable princi- 
ples can not be neglected. 


SUMMARY


The educationally important question of 
how much guidance is desirable if one is 
interested in transfer was tested experi- 
mentally by a new use of coding. Each of
76 college students as his own control 
translated into 20 different codes a com- 
mon four-word sentence, with the rule 
given for half of the problems and re- 
quired to be derived solely from example 
for the other half. As in previous studies 
on initial learning, the Ss did significantly 
better on those problems with the rule 
given. However, a week later on a mul- 
tiple-choice transfer test consisting of 20 
different sentences, one for each of the 
20 coding principles of the first test, the 
selection of the adequate code from three 
specious ones made by randomizing letters 
gave very different results. The scores 
were significantly increased for those 
problems which had formerly been de- 
rived as contrasted with a significant de- 
crease for those problems where the rule 
had formerly been given. A control group 
of 24 college students given only the 
second test proved by significantly poorer 
performance than the experimental group 
the value of transfer from the first test. 
The results give strong support to the 
postulate of Hendrix that independently 
derived principles are more transferable 
than those given. The apparent contra- 
diction with Kittell’s study of children 
was explained by the smaller apperceptive 
mass in the child, and the prediction was 
hazarded that as naivety is lost, the prob- 
ability of transfer from learning which is 
minimally directed is increased. 


OCCUPATIONAL LEVEL AND THE PRIMARY MENTAL ABILITIES*


K. WARNER SCHAIE 
University of Nebraska 


Thurstone’s S.R.A. Primary Mental 
Abilities test (PMA) is frequently used as 
the intelligence test component of a battery 
given for guidance purposes. A reason for 
this use is the common inference that a 
study of the separate and presumably 
independent scores for the different abil- 
ities will yield clues to predict future suc- 
cess in certain school courses and voca- 
tions. Since intellectual functioning is 
generally found to be an important de- 
terminant in predicting successful per- 
formance it would be of interest to vali- 
date assertions that one could go a step 
further and predict differential success 
for a given type of vocational choice. 

A scrutiny of the PMA literature sug- 
gests that validation studies have been 
concerned primarily with the correlation 
of the Primary Mental Abilities with a 
variety of achievement tests. Examples 
of such studies are reported in the test 
manual for the relation of the PMAs 
with the Stanford Achievement test (5) 
and with the Iowa Tests of Educational 
Development (4). Other work has related 
the PMAs to the United States Employ- 
ment Service General Aptitude tests (2). 
These studies, done primarily with high 
school populations, conclude that the 
PMAs are fairly good predictors of cur- 
rent achievement and are useful for guid- 
ance purposes. 

*The data for this study were collected
as part of the author’s Ph.D. dissertation
while a graduate student at University of
Washington, Seattle. Financial assistance
for processing the data was given by Science
Research Associates, Chicago, and is grate-

None of these studies provide any
information, however, upon success in
predicting actual occupational choice. The
present inquiry attempts to fill this gap
indirectly by examining a group of adult 
individuals who have made a firm occupa- 
tional choice to see whether they could 
be differentiated in terms of their PMA 
scores as to the type of occupation se- 
lected. 


HYPOTHESES


The PMA manual was consulted to see 
what kind of predictions were suggested 
by the test authors and others in relating 
performance on the PMAs to activities 
required in various occupations. One pur- 
pose of the PMA profile, for example, 
is its use for estimating the individual’s 
general level of intelligence. Young people 
planning to go to college are presumed 
to require above average standing on most 
of the abilities, but particularly on Ver- 
bal-meaning (V) and Reasoning (R). 
People whose occupational choice results 
in professional types of activity would 
therefore be expected to show high per- 
formance on all abilities but should show 
particular elevation on V and R. Space 
ability (S) is presumed to be important 
for occupations like electrician, machinist, 
engineer or carpenter. Skilled laborers
should therefore be found to be high on 
S. Accountants, cashiers, bank tellers, 
sales clerks and the like are supposed to 
be favored by good arithmetic ability and 
should thus be high on Number ability 
(N). People who run their own business 
or belong to the managerial category 
would be expected to have an education 
and skills somewhere in between the pro- 
fessional and clerical groups and would 
thus be expected to be high on some 
attributes common to both. 
METHOD


A great many personality and other 
variables are involved in determining spe- 
cific job choice and their control would 
be extremely difficult. It was therefore 
decided to concentrate upon a more gen- 
eral differentiation into 10 major occupa- 
tional headings as used in the reports of 
the United States Census Bureau. Since 
the counseling use of the PMA is usually 
at the high school level, four of these 
major occupational classifications were 
selected for study as they are probably 
the most important ones being considered 
in a great majority of cases. These are: 
(a) professional and semiprofessional, (b) 
managerial and proprietary, (c) sales and 
clerical, and (d) skilled labor. In order 
to avoid artifacts introduced by transient 
or enforced job choice or possible PMA 
sex differences, only male Ss were used. 
Since we are interested in stable occupa- 
tional choice no S was to be included if 
he reported a change in his occupation 
or job specification over the past five 
years. 

As part of another investigation, data 
were available on the PMA scores and 
the occupational status of a sample of 
500 adult Ss (3). Since age changes on 
the PMAs over the adult age range are 
known to be substantial, these were con- 
trolled experimentally by matching for 
age over the four occupational levels. 
From a pool of 172 Ss who met the 
initial criteria for inclusion it was thus 
possible to match 20 sets of Ss, or a total 
of 80 Ss. These ranged in age from 26 to 
65 years with a mean age of 45.5 years. 

The S.R.A. Primary Mental Abilities 
test, intermediate form, was given to each 
S and was administered in group sessions 
using the instructions given in the ex- 
aminer’s manual. All raw scores were 
converted to standard scores with means 
of 50 and standard deviations of 10 by 
use of the norms available for the total 
sample of 500 adult Ss. The reported
mean scores are therefore directly com- 
parable for the different mental abilities. 
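
The conversion to standard scores with a mean of 50 and a standard deviation of 10 follows the usual linear transformation. A minimal sketch is given below; the function name and the norm values in the checks are hypothetical, since the actual norms of the 500-S sample are not reported here.

```python
def to_standard_score(raw: float, norm_mean: float, norm_sd: float) -> float:
    """Convert a raw score to a standard score with mean 50 and SD 10."""
    z = (raw - norm_mean) / norm_sd  # distance from the norm mean in SD units
    return 50 + 10 * z
```

For example, with an assumed norm mean of 24 and norm standard deviation of 6, a raw score of 30 lies one standard deviation above the mean and converts to 60. Because every ability is expressed on this common scale, a mean of 44 on one ability and 51 on another can be compared directly.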


RESULTS 


The first step in the analysis of the 
test data was to compute means and 
standard deviations which are given in 
Table 1. The analysis of variance was 
then employed for a formal test of the 
null hypothesis with respect to over-all 
differences between the different PMAs 
and between occupational levels. The anal- 
ysis for the total sample is presented 
in Table 2 and uses methods suggested 
by Edwards (1). The test for the differ- 
ence between occupational levels is based 
on independent observations and thus uses 
within level variance as its error term. 
The test for the over-all differences
between PMAs, however, requires adjustment
for correlation between the mental
abilities. The pooled interaction of indi- 
viduals and PMAs is therefore the correct 
error term for this test. 
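
The choice of error terms described above can be made concrete with the sums of squares legible in Table 2. The degrees of freedom below are not taken from the table (that column is not legible in this copy); they are derived here from the stated design of 4 levels, 20 Ss per level, and 5 abilities, and should be read as an assumption. The function name is illustrative.

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """Return (mean square for effect, mean square for error, F)."""
    ms_effect = ss_effect / df_effect
    ms_error = ss_error / df_error
    return ms_effect, ms_error, ms_effect / ms_error


N_LEVELS, N_PER_LEVEL, N_ABILITIES = 4, 20, 5

# Degrees of freedom derived from the design (assumed, not read from the table).
df_levels = N_LEVELS - 1                         # 3
df_within_levels = N_LEVELS * (N_PER_LEVEL - 1)  # 76
df_pmas = N_ABILITIES - 1                        # 4
df_pooled = df_within_levels * df_pmas           # 304

# Between levels, tested against individuals within levels.
levels_test = f_ratio(4249.04, df_levels, 15877.56, df_within_levels)

# Between PMAs, tested against the pooled individuals x PMAs interaction.
pmas_test = f_ratio(1912.40, df_pmas, 15688.44, df_pooled)
```

Under these assumed degrees of freedom the between-levels computation reproduces the mean squares and the F ratio of 6.78 that are legible in Table 2; the between-PMAs F obtained the same way is a derived figure, not one read from the table.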

Inspection of Table 2 shows that F 
ratios for the variance associated with 
differences among PMAs as well as be- 
tween occupational levels were found to 
be significant at the .001 level of con- 
fidence and the null hypothesis was there- 
fore rejected. The interaction between 
PMAs and occupational levels, however, 
was not significant. These findings suggest 
that there are significant differences in 
over-all intellectual level between the 
different occupational groups as well as 
significant differences between scores on 
different abilities for most individuals. 
The lack of systematic interaction, how- 
ever, indicates that specific PMA profile 
patterns are not a function of occupa- 
tional level. It appears then that profile 
elevation, i.e. level of intelligence as es- 
timated by the total PMA test, rather 
than profile pattern should be considered 
as the significant variable for predicting 
future occupational level. 





TABLE 1
MEANS AND STANDARD DEVIATIONS ON THE PRIMARY MENTAL ABILITIES
FOR DIFFERENT OCCUPATIONAL LEVELS
(N = 20 in each level)

Columns: Skilled labor; Professional & semi-prof.; Clerical & sales; Managerial & propr.
Rows: Verbal-meaning; Space; Reasoning; Number; Word-fluency

[Table body largely illegible in the source. Legible entries in the first column: Verbal-meaning 44.0, Space 51.3, Reasoning 44.2, Number 48.3, Word-fluency 44.1. Remaining fragments: 50.9, 55., 54., 49., 55., 47.]





TABLE 2
ANALYSIS OF VARIANCE FOR THE COMBINED SAMPLE TESTING THE NULL HYPOTHESIS WITH
RESPECT TO DIFFERENCES BETWEEN PRIMARY MENTAL ABILITIES AND
BETWEEN LEVELS OF OCCUPATION
(N = 80; 5 scores for each S)

Source of variance                        Sum of squares   df   Mean square   F ratio

Between levels                                  4,249.04           1,416.35     6.78*
Between individuals in level                   15,877.56             208.92
Total between                                  20,126.60

Between PMAs                                    1,912.40
Interaction: levels X PMAs                        575.56
Pooled interaction: individuals X PMAs         15,688.44
Total within                                   18,176.40

Total variance                                 38,302.00

* Significant at or beyond the .001 level of confidence.


The above analysis does not rule out 
the possibility that a given mental ability 
will tend to discriminate between dif- 
ferent occupational levels while others do 
not. It is also possible that differences 
between Mental Abilities occur only in 
certain but not all of the occupational 
levels studied. To clarify these problems 
further analyses of variance were made 
for each separate occupational level and 
also for each of the different Mental Abil- 
ities. 

The results shown in Table 3 indicate 
that there are indeed significant differ- 
ences between the different PMA mean 
scores within both the “skilled labor” and 
“managerial” levels. Referring back to 


Table 1 it may be seen that for the 
“skilled labor” level Space is high, while 
low performance is found on all the ver- 
bal skills (V, R, and W). In the “man- 
agerial” group high scores are found to 
be Space and Number while this group 
is also low on V, R, and W. It is worthy 
of note that these patterns obviously 
overlap, explaining why the interaction 
between occupational level and abilities 
cannot be significant and why profile el- 
evation turns out to be the significant 
discriminator. 

TABLE 3
ANALYSIS OF VARIANCE FOR THE DIFFERENCES BETWEEN PRIMARY MENTAL ABILITIES
IN EACH SEPARATE OCCUPATIONAL LEVEL, ADJUSTED FOR THE EFFECT OF
CORRELATION WITHIN INDIVIDUALS
(N = 20; 5 scores for each individual)

                     Between abilities    Within individuals    Residual error
Level                    MS        F          MS        F             MS

Skilled labor          222.31    6.59*      284.22    7.52*         33.79
Sales & clerical       142.69    2.40       181.06    3.05*         59.43
Managerial             207.03    3.77*      206.58    3.67*         56.28
Professional            72.45    1.16       163.81    2.63*         62.36

* Significant at or above the .01 level of confidence.

Table 4 gives the results of the analysis
of variance for the Primary Mental
Abilities with respect to differences among
occupational levels on each separate
ability. Verbal-meaning, Reasoning, and
Word-fluency are found to differ significantly
between occupational levels but
Space and Number apparently fail to
discriminate. Examination of the appropriate
means shows high performance on
Verbal-meaning and Reasoning for the
professional group, low performance for
the skilled laborers, and about equal and
intermediate performance for the managerial
and clerical groups. On Word-fluency
the clerical and professional groups
are about equal and high, while the managerial
and skilled labor groups are low.

TABLE 4
ANALYSIS OF VARIANCE FOR THE DIFFERENCES BETWEEN OCCUPATIONAL LEVELS ON EACH
SEPARATE PRIMARY MENTAL ABILITY ADJUSTED FOR THE EFFECT OF
CORRELATION DUE TO MATCHING FOR AGE OF Ss
(N = )

                  Between occupational levels    Between matched individuals    Residual error
Ability               MS          F                  MS          F                   MS

Verbal-meaning      604.45      11.99*              98.91
Space               108.41                         112.80
Reasoning           277.15       4.57*             119.18
Number              318.18       3.94               95.97
Word-fluency        500.25       7.45*             100.09

* Significant at or beyond the .01 level of confidence. Trivial F ratios are omitted.

Another interesting analysis can be
made by inspecting the standard deviations
presented in Table 1. Since the population
standard deviation has arbitrarily
been assigned to be 10, the standard
deviation for any subgroup would be expected
to be significantly lower on any
variable which tends to discriminate the
subgroup from the total group. A low
standard deviation would thus indicate
that this is a variable on which the subgroup
is more homogeneous than the general
population. Such increased homogeneity
was found for the professional
group on Verbal-meaning and for the
managerial group on Word-fluency. Inspection
of the range of standard deviations
among the occupational levels gives
further indications why some of the abilities
fail to discriminate between levels.


SUMMARY


Scores on the intermediate form of the 
S.R.A. Primary Mental Abilities test were 
examined for a stratified sample of male 
Ss from four occupational levels to test 
the hypothesis that differential perform- 
ance on this test is useful in predicting 
future occupational placement. Several 
hypotheses frequently used in counseling 
on the basis of the PMA are presented 
and relevant evidence concerning the 
PMA patterns of adults who have made
permanent occupational choices is given. 

The results of an analysis of variance 
yielded significant differences between the 
over-all ability for different occupational 
levels and also between different abilities. 
The interaction between occupational level 
and individual mental abilities, however, 
was not significant. 

Significant differences were also found 
between abilities within the “skilled labor” 
and “managerial” groups. Analysis of the 
individual mental abilities showed signif- 
icant differences between the mean scores 
for different occupational groups on Ver- 
bal-meaning, Reasoning, and Word-flu- 
ency. 

It should be pointed out that the pres- 
ent study was concerned only with oc- 
cupational levels. Pattern analysis of the 
PMA might therefore still be helpful for 
predicting success in a specific occupation. 
On the basis of the present findings, however,
it must be concluded that profile
elevation (or general intellectual level)
is of greater importance than profile pat- 
tern in predicting vocational choice. 


EFFECT OF INSTRUCTIONS ON FREE ASSOCIATION


ROSCOE A. BOYER 
University of Mississippi 


CHARLES F. ELTON 
University of Kentucky 


Although many investigators (2, 3, 4, 
8, 9, 10) have demonstrated repeatedly 
that the counselor or test examiner may 
either deliberately or inadvertently struc- 
ture the stimulus field, a review of the 
literature indicates that minimal research 
has been done regarding some of the char- 
acteristics of such influence. Only recently 
did Bordin (1) suggest the theoretical
implications of the ambiguity-structuredness
variable in the counseling process.

It was the purpose of this study to in- 
vestigate temporal effects and the verbal 
responses per se resulting from structuring 
the instructions regarding what would be 
appropriate responses using the free as- 
sociation technique. According to Bordin, 
if a group of “minimally anxious” Ss were 
used, it could be deduced from his theo- 
retical approach that the suggestions of 
appropriate responses to some words for 
these subjects would not influence the 
responses made to subsequent words. This 
study attempts to investigate this hy- 
pothesis. 

A secondary purpose was to determine
whether there were regional differences
in free association responses.


PROCEDURE 


The Ss were 401 college students attending
the University of Mississippi during the
1955-56 school year. Of these, 120 were
enrolled in sophomore and junior year
education courses, 130 in sophomore year
psychology courses, and the remainder in
economics, engineering, and undergraduate
statistics courses. No sex differentiation
was made. The Ss were divided into three
groups according to the instructions given
to them on the Kent-Rosanoff Free
Association Test (6). An attempt was made
to have equal representation of students
from the various types of classes in each
of the three groups.

Group I. The instructions given by 
Russell and Jenkins (7) for administering 
the Kent-Rosanoff Free Association Test 
were used and are as follows with the ex- 
ception that Mississippi was substituted 
for Minnesota: 


This is one of the studies in verbal be- 
havior being done at Mississippi. This par- 
ticular experiment is on free association. 

Please write your name on the outside of 
the paper passed to you. You can ignore the 
place for your name on the other side. 

When you open these sheets, you will 
see a list of 100 stimulus words. After each 
word write the first word that it makes you 
think of. Start with the first word; look 
at it; write the word it makes you think of; 
then go on to the next word. 

Use only a single word for each response. 

Do not skip any words. 

Work rapidly until you have finished 
all 100 words. 

When you are through, turn your paper 
over and write on the back the letter that 
appears on the board at that time. 

Are there any questions? 

Ready. Go. 


The following additional section ap- 
peared after their third paragraph: 


For Example, Your Responses to the First 
Words Might be as Follows: 


No.   Stimulus   Response
1     Table      Chair
2     Dark       Light
3     Music      Song
4     Sickness   Health
5     Man        Woman

These example response words were those
listed by Russell and Jenkins as the most 
common responses to the respective stimu- 
lus words. This group consisted of 133 
students and will be referred to hereafter 
as the positively structured group. 

Group II. The instructions to this 
group were the same as those for Group I 
except that the given “response word” 
examples had a frequency of 10 in 1031 
samples as listed by Russell and Jenkins; 
therefore, these were considered uncom- 
mon or atypical responses to the respec- 
tive stimulus words. 

These response words were: 

No.   Stimulus   Response
1     Table      Eat
2     Dark       White
3     Music      Dance
4     Sickness   Bad
5     Man        Mouse

This group consisted of 133 students and 
will be referred to hereafter as the nega- 
tively structured group. 

Group III. The instructions given to 
this group were identical to those given 
by Russell and Jenkins; i.e., no response 
example was given. This group served as 
the control group and consisted of 135 
students. 

The above instruction for the three re- 
spective groups appeared on the first page 
of a three-page mimeographed test book- 
let. The second and third pages followed 
the form given by Russell and Jenkins 
(7). 

In the present study the same procedure 
as that described by Russell and Jenkins 
was followed (7). After the students had 
been working on the test for four minutes, 
the letter A was printed on the black- 
board. Every 30 seconds thereafter a new 
letter was substituted in alphabetical se- 
quence. When the students had completed 
the test, they recorded on the back of 
the test booklet the letter appearing on 
the board at that time. Consequently a
rough index of time necessary for each
student to complete the test could then
be obtained. 
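
The timing procedure described above can be sketched as a small conversion from the recorded letter to an approximate completion time. This is only an illustration: the function name and constants are hypothetical, and it assumes each letter marks the start of its 30-second interval, so the result is a rough lower bound rather than an exact time.

```python
import string

START_SECONDS = 4 * 60  # letter A was posted after four minutes
STEP_SECONDS = 30       # a new letter was substituted every 30 seconds

def completion_time_seconds(letter: str) -> int:
    """Approximate completion time implied by the letter a student recorded.

    Assumption: the letter marks the start of its 30-second interval.
    """
    index = string.ascii_uppercase.index(letter.upper())
    return START_SECONDS + index * STEP_SECONDS

print(completion_time_seconds("A"))  # 240
print(completion_time_seconds("K"))  # 540
```

Because the index advances in 30-second steps, any completion time obtained this way is accurate only to the nearest half minute.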

Following the administration of the 
tests, response frequencies were tabulated 
for each stimulus word for each of the 
three groups. The frequencies were then 
converted into percentages and tests of
significance were made according to the
procedure suggested by Lawshe and Baker 
(5). 

A difference in percentage ratio was
used to investigate the influence of the
suggested answers. The formula used was
(E - C)/C, in which E was the percentage
of responses in the experimental group
and C was the percentage of responses in
the control group. The value of this ratio
lies in offering a convenient way of
showing how the values of the experimental
groups converged with those of
the control group. 
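
As a minimal sketch of the difference ratio just described, the following function computes (E - C)/C from two percentages. The function name and the example percentages are hypothetical and are not taken from the study's data.

```python
def difference_ratio(experimental_pct: float, control_pct: float) -> float:
    """Return (E - C) / C for an experimental and a control percentage."""
    if control_pct == 0:
        raise ValueError("control percentage must be non-zero")
    return (experimental_pct - control_pct) / control_pct

# Hypothetical values: 66 per cent of an experimental group and 40 per cent
# of the control group give the most common response word.
print(round(difference_ratio(66.0, 40.0), 2))  # 0.65
print(round(difference_ratio(40.0, 40.0), 2))  # 0.0
```

A ratio near zero means the experimental group responded like the control group; a positive ratio means the most common response occurred more often in the experimental group than in the control group.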

Throughout this article, references to 
the expression, “most common response 
word,” pertain to the word that was listed 
by Russell and Jenkins as the word having 
the highest response frequency for each 
of the 100 stimulus words. Unless other- 
wise indicated, all statistical analyses in 
this article are based upon the frequencies 
of these 100 most common response words. 
No attempt was made to evaluate dif- 
ferences of other response words. 


RESULTS AND DISCUSSION


Temporal effects. An analysis of vari- 
ance was computed for the length of time 
required by the three groups to complete 
the tests. The F ratio was found to be 
11.14 which was significant beyond the 
one per cent level. Consequently, these 
data were then examined for t values and 
the means and standard deviations are 
given in Table 1. 

It was revealed that Groups I and II
did not vary significantly from each other,
but the differences were significant at the
one per cent level between Groups I and
III and between Groups II and III in
time required to complete the tests. These
data indicate that with the type of
suggestion given to Groups I and II on a
free association test, whether it be a
common or uncommon answer, the time
required to respond will be increased. It
is to be noted that the most common or
normal suggestions to a “normal” population
resulted in the longest response time. The
authors have no suggestion as to why
this behavior occurred but are now in the
process of attempting to duplicate this
behavior with a different population and
testing the hypothesis that differences in
response time will disappear when certain
variables are controlled.


TABLE 1
TIME IN SECONDS REQUIRED TO COMPLETE TEST

Group   Mean
I       536
II      511
III     473

[One further column of the source table reads 139, 121, 121 for Groups I, II, and III; its heading (N or SD) is not legible.]


Detailed examination of most common
responses given to first 10 words. After all
response words to the respective stimulus
words had been tabulated, the frequencies
were converted into percentages. In order
to examine the influence of suggestion, a
difference in percentage ratio was used
and the data for the first 10 stimulus
words are given in Table 2.


TABLE 2
PERCENTAGE OF Ss IN EACH GROUP GIVING MOST COMMON RESPONSE WORD TO EACH
OF THE FIRST 10 STIMULUS WORDS AND THE DIFFERENCE RATIO BETWEEN
UNIVERSITY OF MISSISSIPPI GROUPS

No.   Stimulus   Response
1.    Table      Chair
2.    Dark       Light
3.    Music      Song/s
4.    Sickness   Health
5.    Man        Woman/en
6.    Deep       Shallow
7.    Soft       Hard
8.    Eating     Food
9.    Mountain   Hill/s
10.   House      Home

[Column heads in the source: percentage of group responses for Groups I (a), II (b), III (c), and Minn. (d), followed by the difference ratios. The entries are not legible; the surviving fragments read 65, .30, 27, -20, 19, .07, .02, 31.]

a N for Group I = 133.
b N for Group II = 133.
c N for Group III = 135.
d Percentages are based on Russell and Jenkins data.


It should be recalled that differences
are based upon but one word in each
response set; that is, the most common
response word as given by Russell and
Jenkins. These data in Table 2 revealed
that by the time the students reached
the sixth stimulus word, for all practical
purposes, the influence of the suggested
words had been dissipated. This trend was
more consistent and stable in the
negatively structured group (Group II) than
in the positively structured group (Group
I). The difference in response frequency for
the most common response words between
Groups I and II and between Groups I
and III was significant at the one per cent
level for the first four stimulus words.
Also, there was found a significant
difference at the one per cent level between
Groups II and III for responses given to 
stimulus words [1] Table and [2] Dark. 
A possible explanation of the lack of sig- 
nificance between Groups II and III in 
response frequency to stimulus words [3]
Music and [4] Sickness is that the habit
strength for the most common response
might be considered relatively weak. Ac-
cording to Russell and Jenkins the most
common response for the stimulus word
Music occurs with a frequency of 18 per
cent and the frequency for Sickness is 38
per cent (7). The fact that percentage
differences are significant between Groups 
I and II and Groups I and III for these 
same words may be a function of the sug- 
gestions given in the instructions and/or 
the lack of a strong competing response 
habit strength. The authors are now in- 
vestigating this possibility by ranking the 
words in the Kent-Rosanoff word list ac- 
cording to the response strength of each 
stimulus word and repeating this study. 
The explanation for the significant dif- 
ference in percentage response for the 
stimulus word [5] Man between Groups I 
and II and the absence of a significant dif- 
ference between Groups I and III and 
Groups II and III is more difficult. It 
may be that negative suggestion dissipates 
more rapidly than positive suggestion and 
this could have been operating to produce
such an effect. More likely, the rate of
dissipation is confounded with the problem
of unequal response habit strengths among
the first five words.

Examination of the remaining 90 most
common response words. As indicated
above, the instructions did not influence
the response beyond the fifth word. The
only significant differences that were
found between any two of the three groups
were for the stimulus words [20] Chair,
[23] Woman, [59] Health, and [88] Heavy.
Although these differences were significant
at the one per cent level, an a priori
explanation would be that they have
occurred by chance and were due neither
to the instructions nor to the samples that
were used. 
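
The chance explanation above can be made concrete with a short calculation. The sketch below assumes 90 stimulus words, three pairwise group comparisons per word, and independent tests at the one per cent level. The independence assumption and the function names are illustrative and do not come from the source.

```python
from math import comb

def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of significant results when no real effect exists."""
    return n_tests * alpha

def prob_at_least(k: int, n_tests: int, alpha: float) -> float:
    """Binomial probability of k or more chance-significant results."""
    below = sum(
        comb(n_tests, i) * alpha**i * (1 - alpha) ** (n_tests - i)
        for i in range(k)
    )
    return 1.0 - below

n_tests = 90 * 3  # 90 remaining words, three pairwise comparisons each
print(expected_false_positives(n_tests, 0.01))
print(prob_at_least(4, n_tests, 0.01))
```

Under these assumptions roughly three significant differences would be expected by chance alone, so four observed differences are not surprising.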

Regional differences. Regional differ- 
ences were examined by comparing the 
response frequencies of college students 
at the University of Minnesota with the 
response frequencies made by the control 
group students at the University of Mis- 
sissippi. Only one response word was 
found to be significantly different: i.e., 
for stimulus [24] Cold, at the University 
of Minnesota, 34 per cent gave the re- 
sponse Hot, whereas 60 per cent of the 
college students at the University of Mis- 
sissippi gave that response. 
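
A comparison of two percentages such as the one above can be checked with a standard two-proportion z test. This is a swapped-in illustration, not the Lawshe and Baker procedure the authors cite. The Minnesota sample size of 1000 is a hypothetical placeholder, and 135 is the size of the Mississippi control group.

```python
from math import erfc, sqrt

def two_proportion_z(p1: float, n1: int, p2: float, n2: int) -> float:
    """z statistic for the difference between two independent proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

def two_sided_p(z: float) -> float:
    """Two-sided normal-curve probability for a z statistic."""
    return erfc(abs(z) / sqrt(2))

# 60 per cent of 135 Mississippi control Ss versus 34 per cent of a
# hypothetical 1000 Minnesota Ss gave the response Hot to Cold.
z = two_proportion_z(0.60, 135, 0.34, 1000)
print(round(z, 2), two_sided_p(z) < 0.01)
```

With any Minnesota sample of this order of magnitude, a difference of 26 percentage points lies far beyond the one per cent level.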


SUMMARY 


Four hundred and one undergraduate 
college students were divided into three 
groups for administration of the Kent- 
Rosanoff Word Association Test, under 
the following conditions: The first group 
of students was given five examples of 
common responses to the stimulus words; 
a second group of students was given five 
examples of uncommon responses or re- 
sponses occurring approximately one per 
cent of the time; and a third group of 
students was given no example. The re- 
sulting response frequencies were com- 
pared. There was no apparent difference 
in responses after the sixth word among 
those students who were given common or 
“normal” response examples, those given 
atypical responses, and those given no 
example of responses to the stimulus 
words. No differences were found be- 
tween responses given by college students 
at the University of Minnesota and col- 
lege students at the University of Mis- 
sissippi. This research suggests that the 
influence of any response instructions 
given to a normal population on a free 
association word test will be rapidly dissi-
pated.

TRAITS OF RESEARCH SCIENTISTS

The extent of public support for scien-
tific research and education is dependent
upon the attitudes toward science and 
scientists which prevail in the culture. The 
development of these attitudes begins 
early in the elementary school (2). These 
attitudes are solidified by the time stu- 
dents reach the secondary school level 
(6, 8) where they influence the choice of 
a future career (7). The attitude of the 
public toward the current Man-Into-Space 
program is at least partially influenced by 
a general attitude of both respect for and 
a fear of the influence of scientific ad- 
vances upon our society. Stated somewhat 
differently, this ambivalent feeling toward 
the research scientist means that he is 
simultaneously a “different” and perhaps 
slightly dangerous individual, but also a 
necessary and even useful member of our 
society. These attitudes would appear to 
be particularly significant as far as ele- 
mentary and secondary school teachers are 
concerned since they are in daily contact 
with potential future scientists during the 
period when these attitudes are develop- 
ing. 

An indirect and somewhat disguised ap- 
proach to Ss’ attitudes toward scientists 
may be through a study of the consistency 
with which Ss attribute a syndrome of per- 
sonality characteristics to the average 
scientist. It is clear that stereotypes of the 
personality traits of people in various oc- 
cupations do exist among college students 
(1, 9) and identification of the specific 
traits that Ss believe distinguish the sci-
entist from people in other occupations
may provide an insight into their atti-
tudes toward the scientist and permit the
subsequent development of a relatively
simple and objective assessment device
to measure these attitudes. Terman’s (10)
investigation of intellectual and interest 
differences among four occupational 
groups, scientists, engineers, lawyers, and 
businessmen, suggests that comparisons 
among the personality traits attributed 
to these occupations might provide evi- 
dence as to the students’ stereotype of the 
personality of the research scientist. 


PROCEDURE 


Traits. An original list of approximately 
100 personality trait names was compiled 
from several published sources (1, 8, 9). 
Subsequently a list of 60 traits was selected 
on the basis of two criteria: (a) 30 traits 
that appeared on an a priori basis to be 
socially desirable and 30 traits judged to 
be socially undesirable, and (b) traits
were selected that appeared representative
of many significant dimensions of behavior
including work habits, intellectual char-
acteristics, and both the social and non-
social aspects of personality. These cri-
teria were employed (a) to minimize
response bias in the subsequent ratings and
to include a wide range of social desira-
bility in the selected personality charac-
teristics, and (b) to insure as far as pos-
sible an adequate sampling of many
different areas of behavior. The trait names
finally selected were: accurate, calm,
clumsy, fearful, considerate, meddlesome,
intellectual, economical, democratic, in-
ept, egotistical, cruel, logical, mature, un-
systematic, pessimistic, friendly, sarcastic,
studious, alert, kind, disorganized, timid,
critical, orderly, responsible, incompetent,
impulsive, tactful, annoying, precise, sin-
cere, humorous, unimaginative, reckless,
irritable, persistent, stable, sloppy, nerv-
ous, sympathetic, shy, thorough, self-
confident, generous, unproductive, miserly,
fault-finding, industrious, dependable, in-
efficient, moody, tolerant, argumentative,
capable, unreliable, rigid, poised, lonely,
charming.

Forms. Four occupational titles were 
selected (research scientist, engineer, law- 
yer, businessman), similar to the com- 
parison groups used by Terman (10), 
and six forms were prepared, one form for 
each of the possible combinations of two 
of the four occupations. On each form the 
S was requested to compare the average 
person in one (rated) occupation with 
the average person in another (reference) 
occupation and to make a judgment as to 
whether the first person would be most
likely to have more of, less of, or an equal
amount of each of the 60 traits compared
with the person in the second (reference)
occupation. For example, the significant portions
of the instructions for Form C were: 


We all know that a person’s interests,
abilities, attitudes, and personality charac- 
teristics determine to a large extent what 
occupation he or she will select. For ex- 
ample, the average research scientist has 
more or less of certain traits than does the 
average businessman, although on other 
traits these two people will have the same 
amount of these particular traits. We are 
asking you to identify which of these traits 
distinguish the research scientist from the 
businessman and which traits they have in 
common. 

Below is a list of 60 traits to be identi- 
fied.... Please indicate on your answer 
sheet your judgment for each of the traits 
using the following marking system: 

Column A: the average research scientist 
has more of this trait than the average 
businessman. 

Column B: both the average research 
scientist and the average businessman 
have about the same amount of this 
trait. 

Column C: the average research scientist 
has less of this trait than the average 
businessman. 


The combinations of occupations used on
the forms are as follows: (a) Research
Scientist vs. Engineer; (b) Research Sci-
entist vs. Lawyer; (c) Research Scientist
vs. Businessman; (d) Engineer vs. Law-
yer; (e) Engineer vs. Businessman; and
(f) Lawyer vs. Businessman.
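
The six forms listed above are simply all unordered pairs of the four occupations. A minimal sketch, with hypothetical variable names, reproduces the same pairing and lettering:

```python
from itertools import combinations

OCCUPATIONS = ["Research Scientist", "Engineer", "Lawyer", "Businessman"]

# Pair each form letter with (rated occupation, reference occupation).
forms = dict(zip("ABCDEF", combinations(OCCUPATIONS, 2)))

for letter, (rated, reference) in forms.items():
    print(f"Form {letter}: {rated} vs. {reference}")
```

Four occupations taken two at a time give 4 x 3 / 2 = 6 pairs, which is why six comparison forms were needed.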

A seventh form (Form G) was con- 
structed that requested the Ss to rate 
each of the 60 trait names on a five-point 
scale of social desirability. No reference
was made on this form to specific occupa- 
tions, but the Ss were told that we were 
trying to obtain relative measures of the 
social desirability of a large number of 
personality traits. This last form was used 
as a check on our original dichotomization 
of the traits into socially desirable and 
undesirable groups. 

Subjects. Form G was administered to
54 Ss (18 men and 36 women) enrolled in
two sections of introductory educational
psychology. Forms A through F were ran-
domly distributed to 154 Ss in four other
sections of the same course, each S re-
ceiving only one form. Sixteen Ss were
discarded from this second group to in-
sure that equal numbers of men and
women Ss responded to each form. The
discarding of Ss from each form-sex sub-
group was random and the final group
consisted of 138 Ss with 23 Ss (8 men 
and 15 women) recording their judgments 
on each of the six forms. The Ss were 
sophomore pre-education students who 
are required to take this course prior to 
admission to the School of Education. 


RESULTS


The mean social desirability rating of 
each trait by the 54 Ss who received Form 
G was computed and no overlap was found 
in mean ratings between the 30 traits that 
had been a priori selected as socially de- 
sirable and the 30 traits selected as socially 
undesirable. Consequently, the original 
grouping of the items into these two classes 
was retained in subsequent analyses. 

The answer sheets of the 138 Ss who
responded to Forms A through F provided
four separate scores: the number
of desirable traits the rated occupation
had more of (Desirable-More), the number
of desirable traits the rated occupation
had less of (Desirable-Less), the number
of undesirable traits the rated occupation
had more of (Undesirable-More), and the
number of undesirable traits the rated
occupation had less of (Undesirable-Less).
Each of these four scores was then
separately subjected to a two-criterion (sex
and forms) analysis of variance. The
results are reported in Table 1. Both the
Desirable-More and Desirable-Less scores
discriminated among the six forms at the
.001 level of confidence, but neither the
Undesirable-More nor the Undesirable-Less
scores gave any evidence of significant
differences among the forms. No
statistically significant (.05 level) sex
differences were found in any of the
analyses. Apparently the Ss could consistently
discriminate differences among the pairs
of occupations with respect to the presence
or absence of desirable traits shown by
the average individual in the four
occupations, but did not discriminate among the
occupations as to undesirable personality
traits. This suggests that stereotypes
concerning occupational personalities involve
the presence or absence of socially
desirable traits only.


TABLE 1
ANALYSES OF VARIANCE OF FOUR PERSONALITY TRAIT SCORES DISTINGUISHING
BETWEEN COMBINATIONS OF OCCUPATIONS

                            Desirable-More        Desirable-Less
Source of Variation   df    Mean Square    F      Mean Square    F
Sex (Sx)                1       5.22      .29        38.74      2.77
Forms (F)               5     147.60     8.19*      138.06      9.87*
Sx X F                  5      16.67      .92         7.61       .54
Within                126      18.03                 13.99

* P < .001

[The Undesirable-More and Undesirable-Less columns are only partly legible in the source. One of them gives mean squares of 36.27, 11.92, 24.53, and 16.67, in that order.]


TABLE 2
MEAN NUMBERS OF DESIRABLE TRAITS ATTRIBUTED TO FOUR OCCUPATIONS IN
SIX COMBINATIONS (n = 23, N = 138)

Form   Rated Occupation   Reference Occupation
A      Scientist          Engineer
B      Scientist          Lawyer
C      Scientist          Businessman
D      Engineer           Lawyer
E      Engineer           Businessman
F      Lawyer             Businessman

[The Desirable Traits columns (More, Less, Difference) are largely illegible in the source. Legible fragments, column uncertain: 9. (Form A), 10.0 (Form C), 5.8 (Form D), 7.0 (Form E), 2.4 (Form F), and 10.7, 6.9, 3.1.]


The mean number of desirable traits
attributed to each pair of occupations can
be found in Table 2. The difference score
between the “more” and “less” means can
be viewed, in a sense, as a “favorability”
score for the rated occupation when
compared with the reference occupation. Both
the sum of the “more” and “less” scores
and the difference between these two
scores were subjected to two-criterion (sex
and forms) analyses of variance. The sums
did not discriminate between the forms
(F = 1.23) while the difference score
showed form differences that were
significant at the .001 level (F = 14.75). No
sex differences were found in either
analysis. Duncan’s method was used to test the
significance of the differences among the
difference score means (4, 5, pp. 26-29)
and it was found that the difference score
means fell into four groups. The difference
mean of Form D was significantly (.05
level) larger than the difference means of
Forms A and B. Forms A and B were
significantly different from Forms C and
E, and the mean difference score of Form
F was significantly lower than the mean
difference scores of Forms C and E. The
differences in means between Forms A
and B and also between Forms C and E
were not significant. The research
scientist and the lawyer were quite similar
in difference (“favorability”) scores as
were the engineer and businessman.
However, the scientist-lawyer pair of
occupations were quite distinct from the
engineer-businessman in mean difference
scores.


TABLE 3
SINGLE PERSONALITY TRAITS MOST CONSISTENTLY ATTRIBUTED
TO SIX COMBINATIONS OF OCCUPATIONS

Research Scientist vs. Engineer
   The first occupation is more: Persistent, Studious, Intellectual, Alert, Thorough
   The first occupation is less: Economical

Research Scientist vs. Lawyer
   The first occupation is more: Precise
   The first occupation is less: Charming, Poised, Humorous, Self-confident, Friendly

Research Scientist vs. Businessman
   The first occupation is more: Thorough, Studious, Precise, Intellectual, Orderly, Persistent, Accurate, Logical
   The first occupation is less: Charming, Tactful, Friendly, Humorous, Economical

Engineer vs. Lawyer
   The first occupation is more: Precise
   The first occupation is less: Poised, Charming, Self-confident, Tactful, Alert

Engineer vs. Businessman
   The first occupation is more: Precise, Accurate, Studious, Thorough
   The first occupation is less: Tactful, Humorous, Poised

Lawyer vs. Businessman
   The first occupation is more: Studious, Intellectual, Poised, Precise, Thorough, Logical, Persistent, Tactful, Tolerant
   The first occupation is less: Economical

Analyses were performed for the 30
desirable traits on each form to identify
the single traits that distinguished between
each pair of occupations. The “more” and
“less” percentages (N = 23) were com-
puted for each trait on each form and if 
either percentage was greater than 57 per 
cent the trait was selected as being a 
discriminating item. The results of these 
individual trait analyses can be found in 
Table 3. It seems apparent that stereo- 
types exist for all four occupations. The 
scientist and lawyer appear similar in 
the more intellectual traits, but the sci- 
entist lacks the warm social graces that 
characterize the lawyer stereotype. The 
engineer is a junior edition of the scientist 
while the businessman lacks the intellec- 
tual qualities of the lawyer, but shares 
many of his social traits. 
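
The selection rule for discriminating traits described above can be sketched as follows. The function name and the example counts are hypothetical. The cutoff of 57 per cent and the form size of 23 Ss are taken from the text.

```python
from typing import Optional

def discriminating_direction(
    n_more: int, n_less: int, n_total: int = 23, cutoff_pct: float = 57.0
) -> Optional[str]:
    """Return "more" or "less" if that percentage exceeds the cutoff, else None."""
    more_pct = 100.0 * n_more / n_total
    less_pct = 100.0 * n_less / n_total
    if more_pct > cutoff_pct:
        return "more"
    if less_pct > cutoff_pct:
        return "less"
    return None

print(discriminating_direction(14, 3))  # more
print(discriminating_direction(13, 5))  # None
```

With 23 Ss per form, 14 of 23 (about 61 per cent) is the smallest count that exceeds the cutoff, since 13 of 23 is only about 56.5 per cent.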

Since our basic interest was in the
stereotype of the research scientist, the
traits that most consistently discriminated 
the scientist from the other three occupa- 
tions are given in Table 4. Again the 
same stereotype pattern appears: the sci- 
entist, along with the lawyer, has more of 
the socially desirable intellectual and work 
habit traits than does the engineer and 
the businessman, while the scientist, like 
the engineer, has less of the social graces 
than does the lawyer and businessman. 
The 12 traits listed in Table 4 appear to 
constitute the core stereotype that the 
Ss had regarding the research scientist. 


DIscussION 


The evidence that our college Ss con- 
sistently attribute certain personality 
traits to the occupations of research sci- 
entist, engineer, lawyer, and businessman 
confirms the results of other studies (1, 
9) where different occupations and dif- 
ferent methodologies were used. The find- 
ing of most general interest was that the 
Ss discriminated among the occupations 
only for socially desirable traits and not 
for socially undesirable traits. Whether 
this implies reluctance on the part of the 
Ss to say that one occupation has more 
of these undesirable traits, or hesitation 
in attributing relative freedom from un-
desirable traits to the paired occupation,
cannot be determined from our data. The 
implications for research where the S is 
required to attribute personality traits to 
himself and also to other people are ob- 
vious. 

The analysis of individual traits
comprising the stereotypes of the personalities
of these four occupations suggests that
the occupations were discriminated along
two relatively independent dimensions.
The traits can be grouped into (a) those
referring to intellectual and work habits
characteristics and (b) those related to
social personality traits that arise
primarily in interpersonal relations. Research
scientists are viewed as being high (having
more of the traits) on the intellectual
dimension and as being low (having fewer
of the traits) on the social axis. Engineers
are not as intellectual as scientists, but
members of both professions are equally
lacking in social graces. The lawyer is
equally high on both axes, while the
businessman is low on the intellectual
dimension and high on social traits. Whether or
not other occupations can be located
within this two-dimensional system, or
whether other dimensions would have to
be added are questions for further study.


TABLE 4
PERCENTAGES OF SUBJECTS SAYING THAT CERTAIN SOCIALLY DESIRABLE TRAITS
DISTINGUISH THE RESEARCH SCIENTIST FROM MEN IN OTHER OCCUPATIONS

Research Scientist is More ... Than the Average Businessman, Engineer, Lawyer
[Column alignment is not recoverable; legible percentages are given in source order.]

Intellectual     70
Logical          61
Orderly          70, 48
Persistent       [illegible]
Precise          87, 78
Studious         91, 43
Thorough         96, 39

Research Scientist is Less ... Than the Average

                 Businessman   Engineer   Lawyer
Charming             7[?]         34        83
Friendly             65           39        57
Humorous             61           [?]       70
Poised               52           30        74
Self-confident       35           48        61

The 12 traits listed in Table 4 appear 
to offer a possibility of developing a short 
objective scale for measuring the extent of 
the stereotype of the research scientist held 
by individual Ss. The same procedure used 
in this study could be repeated by ad- 
ministering Forms A, B, or C to Ss and 
scoring each S as to how many of the 
first seven traits the S says the scientist
has more of and how many of the last
five traits he indicates the scientist has 
less of. Such a short 12-item scale may 
lack sufficient reliability for research pur- 
poses, but, if reliable, would offer a method 
of quantifying this aspect of attitudes for 
further research. 
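
The proposed 12-item scale can be sketched as a simple scoring function. The trait lists follow the seven "more" and five "less" traits named in the study, and the response codes follow the Form C marking system (A = more, B = same, C = less). The function name and the example answer sheet are hypothetical.

```python
MORE_TRAITS = (
    "intellectual", "logical", "orderly", "persistent",
    "precise", "studious", "thorough",
)
LESS_TRAITS = ("charming", "friendly", "humorous", "poised", "self-confident")

def stereotype_score(responses: dict) -> int:
    """Count answers that agree with the research-scientist stereotype.

    `responses` maps a trait name to "A" (scientist has more),
    "B" (same amount), or "C" (scientist has less).
    """
    score = sum(responses.get(trait) == "A" for trait in MORE_TRAITS)
    score += sum(responses.get(trait) == "C" for trait in LESS_TRAITS)
    return score

# Hypothetical answer sheet: full agreement on the intellectual traits,
# agreement on only two of the social traits.
sheet = {trait: "A" for trait in MORE_TRAITS}
sheet.update({"charming": "C", "friendly": "C", "humorous": "B",
              "poised": "B", "self-confident": "A"})
print(stereotype_score(sheet))  # 9
```

Scores range from 0 to 12, with higher scores indicating closer agreement with the stereotype. Whether such a short scale is reliable would have to be established empirically, as the text notes.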

It is particularly interesting to note 
that the Ss displaying this stereotype of 
the research scientist were pre-education 
students who, in a few years, will be 
teaching children and adolescents from 
whom the next generation of scientists 
must be recruited. If this stereotype con- 
tinues to be transmitted from teacher to 
student, the problem of interesting high 
school students in scientific careers will 
remain with us. 


SUMMARY


Pre-education college Ss (N = 138)
were asked to compare two of four oc-
cupations (research scientist, engineer,
lawyer, businessman) as to whether the
average members of the paired occupations
would have more, less, or an equal amount 
of each of 60 personality traits. Equal 
numbers of Ss (N = 23) responded to 
each of the six possible pairs of occupa- 
tions. The traits were evenly dichotomized 
into socially desirable and socially un- 
desirable groups on the basis of an a 
priori selection which was validated by 
having another group of Ss (N = 54) rate 
the traits for social desirability. The num- 
ber of socially desirable traits attributed 
to each occupation discriminated among 
the occupations (.001 level), but the 
socially undesirable traits did not. Analy-
ses of the 30 individual socially desirable
traits indicated that the Ss viewed the
scientist and lawyer as having more of
the intellectual traits while the lawyer and
the businessman were perceived as having
more of the desirable interpersonal traits.
Both the scientist and engineer have less 
of the interpersonal traits and the busi- 
nessman has few of the intellectual traits. 
The most consistent stereotype in this 
study regarded the research scientist, when 
compared with the other three occupa- 
tions, as being more intellectual, logical, 
orderly, persistent, precise, studious, 
thorough, and also as being less charming, 
friendly, humorous, poised, and self-confident. 


AUDITORY ABILITIES AND ACHIEVEMENT IN SPELLING 
IN THE PRIMARY GRADES* 


DAVID H. RUSSELL 
University of California, Berkeley 


School people are necessarily concerned 
with language habits and attitudes that 
develop in the primary grades. When these 
learnings involve spelling they may influ- 
ence a child’s achievement in written lan- 
guage not only in the beginning grades but 
throughout later years. In his first years 
in school the child utilizes language skills 
acquired in preschool years and may, 
therefore, rely heavily on auditory clues 
to words he must spell. There is some evi- 
dence that auditory abilities are more im- 
portant for spelling in the lower grades 
than they are by the time the child 
reaches the seventh grade level of spelling 
ability (11). Evidence of the close rela- 
tionship between auditory and spelling 
abilities in the primary grades has been 
suggested by such investigators as Brad- 
ford (1) and Russell (10). Bradford used 
a revised paper-and-pencil test of indi- 
vidual vowel and consonant sounds and 
blends of “regularly spelled” words and 
found considerable growth between the 
first and second grades. Russell found 
correlations between spelling and auditory 
tests ranging from the .20’s to .80’s in a 
second grade group. Typical of the in- 
vestigations in the area is one by Rudisill 
(9) which found a correlation of .69 be- 
tween spelling and phonic knowledge for 
a group of 315 third grade children in 
North Carolina. In another study of the 
effects of phonic training at the second- 
grade level, Zedler (13) found improve- 
ment, after 14 hours of instruction, in both 
spelling scores and speech-sound discrimi- 
nation abilities. 

* Barbara J. King and Gerald M. Meredith 
assisted in collecting and processing the data. 


Although spelling has usually been regarded 
as one of the simpler skills acquired 
in school, one with a heavy loading 
of associative learning, Horn (7) and 
others have shown just how intricate and 
complex the relationship between sounds 
and letters may be. The apparently plain 
injunction to combine phonetic analysis 
in spelling and reading activities is not 
so simple nor so direct as it seems. Horn, 
for example, has listed six types of evidence 
which must be considered in relating audi- 
tory characteristics of words to their spell- 
ing and has presented facts about three 
of them: (a) the variation in pronunciations 
of the “same” words, (b) the different 
ways in which the various English 
sounds are spelled, and (c) the ways chil- 
dren actually spell sounds in common 
words. He concludes, for example, that 
there is little justification for the claim 
that children can spell the words they can 
pronounce and therefore believes that di- 
rect teaching of the large number of ir- 
regularly phonetic words is inevitable. 

Since many words must be taught di- 
rectly in the lower grades, the question 
every teacher faces is that of how to get 
children to study the words efficiently. 
Shall this child be encouraged to rely on 
visual techniques? Does that child do 
better with auditory techniques, and if 
so, which ones should he use? The present 
study is concerned with identifying audi- 
tory techniques which a child is most 
likely to find useful at the primary-grade 
level. 


PROCEDURES 


To explore further the relationships be- 
tween auditory abilities and spelling, 97 
children in the first three grades of an 
Oakland, California, school were tested. 
The numbers used were Grade I, 30; 
Grade II, 32; and Grade III, 35. Since 
some of the tests were not useful for 
children reading at the pre-primer level, 
the results in the study are obtained largely 
from a sample of 85 children with somewhat 
less than a third of the group in 
first grade. The children came largely from 
middle or lower-middle class homes. In 
the three grades the range in CA was from 
6-8 to 10-2, in MA from 6-7 to 10-2, and 
in IQ from 83 to 119. 

The following standardized tests were 
administered: 


1. The Kuhlmann-Anderson Intelligence Tests 

2. The California Achievement Test: Spelling 

3. The Durrell-Sullivan Reading Capacity Test 

4. The Gates Primary (and Advanced Primary) Reading Tests, Types I, II and III. 


In addition, six tests of auditory discrimination 
were given to the children. 
Since not all of these have been published 
they are described briefly, with examples. 
The group tests were: 


1. Caffrey-Russell Auditory Discrimination 
Test I Same-Different. The teacher 
reads pairs of words such as “shown-sewn,” 
“style-style” and “mobbed-mopped” and the 
child marks whether they are the “Same” 
or “Different.” (This test proved to be too 
easy for the group.) 

2. Caffrey-Russell Auditory Discrimination 
Test III: telling whether words are different 
in initial, middle, or final sounds. The children 
mark 1, 2, or 3 (corresponding to initial, 
middle, final) on an answer sheet for such 
pairs as “butter-buzzer,” “pits-pitch,” and 
“shoed-chewed.” 

3. Durrell Test of Hearing Sounds in 
Words. This test (2) consists of three sub- 
tests: (a) marking the printed word which 
has an initial sound the same as a word 
given orally. Example: The teacher says 
“top” and the children mark one letter of 
p, b, t, n, a. (b) Marking the printed word 
which has the same final or beginning sound 
as a word given orally. Example: The 
teacher says “happen” and the child marks 
one of hexameter, generation, and hydrogen. 
(c) The pupil draws a circle around all 
phonetic elements (such as letters, blends, 
phonograms) heard in a given word. For 
example the teacher says “blinding” and 
the child marks the following: ind r bl x t 
ing. 

4. Durrell-Sullivan Reading Capacity 
Test. This is a test of comprehension of 
paragraphs read orally which might be called 
a listening or auditory test rather than a 
reading test. The eight paragraphs used were 
modified slightly from the Durrell-Sullivan 
Capacity Test so that raw scores were used 
in computation. Each paragraph was fol- 
lowed by five oral questions in which the 
child marked one of three possible answers. 

Two tests were given individually. These 
were: 

5. The Gates test of Giving Words with 
Stated Initial Sounds described in The Im- 
provement of Reading (3) in which the S is 
asked to name three words which begin like 
each of three words suggested by the ex- 
aminer. 

6. Gates test of Giving Words with Stated 
Final Sounds which is described in the same 
source. The child is asked to say three words 
which rhyme with each of three words sug- 
gested by the examiner. 


RESULTS 


Table 1 gives the zero-order correlations 
for the various tests with spelling scores. 
The table indicates that the reading tests 
as a group correlate more highly with 
spelling than the individual auditory tests 
but that the combined group or battery of 
auditory tests correlates with spelling as 
highly as the Gates reading tests. The table 
further suggests that, for this group, the 
best test of auditory abilities in relation to 
spelling is the Durrell test, composed of 
three subtests. Chronological age and men- 
tal age do not seem closely related to spell- 
ing ability. In general, the results suggest 
that rather complex auditory abilities in- 
volving sound recognition in various parts 
of a word are more closely related to spell- 
ing ability than is recognition of sounds of 
whole words as in same-different or rhym- 
ing tests. The close relationship of the 
Gates reading tests, especially in word 
recognition, to spelling tends to confirm 
an earlier finding (10). In addition to 
these calculations, the raw scores of the 
auditory tests were converted to standard 
scores, but the correlations computed gave 
about the same correlations as the raw 
scores. 


TABLE 1 
CORRELATIONS OF VARIOUS FACTORS WITH 
SPELLING ABILITY FOR 85 CHILDREN IN 
GRADES I, II, AND III 

Variable                                        Zero-order r 
                                                With Spelling* 

General 
   1. Kuhlmann-Anderson Mental Age**                .31 
   2. Chronological Age 
Reading 
   3. Gates Type I                                  .63 
   4. Gates Type II                                 .87 
   5. Gates Total 
Auditory 
   6. Caffrey-Russell I 
   7. Caffrey-Russell III 
   8. Durrell Sounds 
   9. Gates Initial Sounds 
  10. Gates Rhyming 
  11. Listening Comprehension (Durrell-Sullivan) 
  12. Auditory Total (Items 6 to 11) 

* For n = 85, r = .22 significant at 5% level, r = .28 
significant at 1% level. 
** n = 58. 

Table 2 illustrates the use of the coeffi- 
cient of multiple correlation to estimate 
the relationship to spelling of four com- 
bined auditory tests. The contribution to 
variance may be computed by multiplying 
the zero-order correlation by its standard 
partial regression coefficient. In this study 
the Doolittle Method was used in com- 
puting the standard partial coefficients as 
described in Walker and Lev (12, pp. 326- 
331). Once computed, the standard par- 
tial regression coefficients are inserted in 
the conventional formula for the multiple 
correlation, and a measure of contribution 
of multiple factors to a set of scores may 
be estimated. Table 2 illustrates that fac- 
tors other than auditory abilities account 
for spelling achievement in the group 
tested but that the relationship of auditory 
abilities to spelling ability is a significant 
one. 
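
The computation described in this paragraph can be sketched as follows. The sketch is an illustration only: plain Gaussian elimination stands in for the Doolittle Method (both solve the same normal equations), the function names are hypothetical, and the correlations in the example are made-up values, not figures from Table 2.

```python
def solve_linear(a, b):
    """Solve the system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def multiple_correlation(r_xx, r_xy):
    """Return (betas, contributions, R) for standardized predictors.

    r_xx: intercorrelations among the predictor tests.
    r_xy: zero-order correlation of each predictor with the criterion.
    Each contribution to variance is beta_i * r_i; their sum is R squared.
    """
    betas = solve_linear(r_xx, r_xy)
    contributions = [b * r for b, r in zip(betas, r_xy)]
    return betas, contributions, sum(contributions) ** 0.5


# Illustrative values only: two tests correlating .60 and .50 with the
# criterion and .50 with each other.
betas, contributions, big_r = multiple_correlation(
    [[1.0, 0.5], [0.5, 1.0]], [0.6, 0.5])
print([round(c, 3) for c in contributions], round(big_r, 3))
```

Because the contributions sum to R squared, whatever remains up to 1.00 is the variance in the criterion left unaccounted for by the predictors entered.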

TABLE 2 
MULTIPLE CORRELATION AND CONTRIBUTION 
TO VARIANCE OF SPELLING BY 

Auditory Test                  Zero-order r     Cum. Mult.     Contribution 
                               With Spelling    Correlation    to Variance 

1. Durrell Sounds                  .66                             35% 
   (1) plus 
2. Caffrey-Russell III             .51                             13 
   (1) and (2) plus 
3. Gates Initial Sound             .29                              3 
   (1) and (2) and (3) plus 
4. Caffrey-Russell I               .22 
Total Variance Accounted for 

* Significant at the 1% level of confidence 


TABLE 3 
INTERCORRELATIONS OF AUDITORY TESTS 

Test     I      II     III    IV     V      VI     Total Auditory Score 

I                             .47    .36              .41 
II                            .42    .25              .69 
III                           .42    .29              .79 
IV       .47    .42    .42           .43              .65 
V        .36    .25    .29    .43                     .48 
VI                                                    .42 

Note. Code: 
I    Caffrey-Russell Auditory Discrimination Test I 
II   Caffrey-Russell Auditory Discrimination Test III 
III  Durrell Sounds in Words Test 
IV   Listening Capacity Test (adapted from Durrell-Sullivan) 
V    Gates Giving Words With Same Initial Sound 
VI   Gates Giving Words that Rhyme 


Table 3 gives some further information 
about the interrelationships of the auditory 
abilities involved in this study. It 
indicates that the combined score ob- 
tained on the battery of three Durrell tests 
is most closely related to the total auditory 
scores. As mentioned above, the Caffrey- 
Russell I test as administered was too easy 
for this group with a considerable num- 
ber of top scores reducing the size of the 
correlations and the discrimination value 
of the test. It should also be noted that the 
Listening Capacity Test, which was a 
measure of comprehension of verbal ma- 
terials, had a fairly high correlation with 
the other auditory perception tests. 


CONCLUSIONS 


This study of 85 children in the first 
three grades revealed that some auditory 
abilities are significantly related to spell- 
ing abilities at the one per cent level of 
confidence. It began an exploration of spe- 
cific auditory abilities which are most 
closely related to spelling achievement and 


found that these were rather complex abili- 
ties involving word parts rather than whole 
words. A battery of three Durrell tests of 
word sounds and the Caffrey-Russell III 
test which involved recognition of like- 
nesses in initial, middle or final positions 
were closely enough related to spelling 
ability to warrant further study. There is 
considerable evidence that a group of audi- 
tory abilities can be good predictors of 
spelling success in the primary grades but 
the constituents of this group must be 
studied more broadly and more exactly. 
The hypothesis that different children have 
quite different auditory abilities and there- 
fore should be taught spelling, and possibly 
reading, with different kinds of auditory 
techniques also needs further testing. 
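
The significance levels cited in this study follow from the usual t transformation of a correlation coefficient. The sketch below checks the thresholds given in the note to Table 1 (r = .22 and r = .28 for n = 85). The critical t values for 83 degrees of freedom are approximate figures from a standard t table and are an assumption here, as is the function name.

```python
import math


def t_from_r(r, n):
    """t statistic (n - 2 degrees of freedom) for testing r against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Approximate two-tailed critical values of t for 83 degrees of freedom
# (assumed from a standard table; not given in the article).
T_CRIT_05 = 1.989
T_CRIT_01 = 2.636

for r, crit in ((0.22, T_CRIT_05), (0.28, T_CRIT_01)):
    t = t_from_r(r, 85)
    print(f"r = {r:.2f}: t = {t:.2f}, exceeds critical value: {t > crit}")
```

With a sample this size, correlations only slightly below these thresholds (.21 and .27) fall short of the same critical values, which agrees with the cut-offs reported in the table note.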

In addition to the role of auditory abili- 
ties in spelling achievement, the investiga- 
tion has confirmed earlier results of the 
close relationship between the Gates tests 
of primary reading and spelling ability at 

this age. On the other hand, the factors of 
chronological age and mental age within 
this fairly narrow age range are not sig- 
nificant factors in spelling ability. 

The relationship of listening compre- 
hension of oral paragraphs to the auditory 
and spelling tests is a puzzling one. Better 
measures of listening or auding ability are 
needed. If the test used in this study is a 
valid one it appears that the ability to 
listen to paragraphs with comprehension 
is not closely related to spelling ability 
(r = .33) but that it is fairly closely re- 
lated to the combined auditory perception 
or discrimination scores (r = .65). This 
fact, and the close relationship of the spell- 
ing scores to the Gates word recognition 
test indicate the presence of other factors, 
possibly visual discrimination abilities, 
which were not considered in the present 
study. 

The results further indicate the need of 
complete exploration of different kinds of 
phonetic and auditory abilities and their 
relations to spelling achievement. In addi- 
tion to the six measures used in this study 
possible tests include other tests devised 
by Bradford (1), by Durrell (2), by 
Holmes (6), by Roswell-Chall (8) and 
others. The present study suggests that 
the simpler skills of detecting same-differ- 
ent word pairs or suggesting rhymes are 
not so closely related to spelling achieve- 
ment as are more complex auditory abili- 
ties such as identifying sounds in various 
parts of words. Knowing when similar 
sounding syllables are alike and different, 
and knowing the various ways a syllable 
may be spelled once it has been recognized, 
make the apparently simple process of 
spelling more complex than it first seems. 


SUMMARY 


The relation of scores on the six tests 
of auditory discrimination, sometimes la- 
belled “phonetic skills,” to scores on in- 
telligence, spelling, and reading tests was 
determined for 85 children in the first three 
grades of a California school. The results 
indicated that some verbal auditory skills 
are significantly related to both spelling 
and reading ability and that these abilities 
involved recognition of word parts rather 
than whole words. The relationship of 
listening comprehension of paragraphs to 
spelling scores was much lower. Considera- 
ble contribution to spelling variance was 
unaccounted for indicating the possibility 
that visual discrimination factors may be 
important in spelling or that a wider range 
of specific kinds of auditory skills should be 
tested probably in relation to both spelling 
and reading. 


COMPARISON OF PRESCHOOL STANFORD-BINET 
AND SCHOOL-AGE WISC IQS* 


FRANCES FUCHS SCHACHTER AND VIRGINIA APGAR 
College of Physicians and Surgeons, Columbia University 


In several studies, the Stanford-Binet, 
Form L, and the Wechsler Intelligence 
Scale for Children (WISC) were administered 
to the same children, at the same 
age, by the same examiner (2, 5, 7, 9, 
10, 11, 12, 13, 15). The following results 
were obtained: (a) The median correlation 
between the Stanford-Binet and the 
WISC Full Scale IQ was .85. (b) The highest 
intertest correlations were obtained for the 
WISC Full Scale, the next highest for the 
WISC Verbal Scale, and the lowest for the 
WISC Performance Scale (2, 5, 7, 10, 11, 
12, 13). (c) Mean Stanford-Binet IQs 
were significantly higher than mean WISC 
IQs (10, 11, 12, 13). (d) Significantly 
greater intertest discrepancies occurred at 
the high IQ and low age levels (10). 

Previous findings may require modification 
before application to the situation 
where retesting occurs at different ages 
with different examiners. The present 
study provides data to determine whether 
such modification is necessary. Preschool 
Stanford-Binets are compared with school-age 
WISCs. Comparing these two age 
levels has additional practical interest, because 
the WISC cannot be used before 
age five, so that the Stanford-Binet remains 
the only major preschool intelligence test. 


* This investigation, part of a larger research 
project, was supported by research 
grant (3B-9007) from the National Institute 
of Neurological Diseases and Blindness, of 
the National Institutes of Health, United 
States Public Health Service. 

The authors wish to express their appreciation 
to William Langford for his guidance 
and generosity in providing the facilities 
to make this study possible, and to Arthur 
Carr and Joseph Zubin for their helpful 
advice. 


SUBJECTS AND PROCEDURE 


Subjects (Ss) were randomly selected 
from a clinic population born at Sloane 
Hospital for Women, previously described 
by Apgar et al. (1). At preschool age 
(mean = 49.4 months, sigma = 5.9), Ss 
were asked to return to the hospital for 
Stanford-Binets. At school age (mean = 
100.2 months, sigma = 6.3), Ss were asked 
to reappear for WISCs. The average in- 
terval between tests was 50.8 months 
(sigma = 2.2). 

Of 404 Ss selected, 119 returned for both 
tests in response to standard mail requests. 
Six Ss were excluded from the sample. Two 
were not testable, three had possible brain 
damage occurring in the intertest interval, 
and one had an intertest interval exceeding 
the mean interval by five sigmas. The 
final sample numbered 113, 61 males and 
52 females. There were 39 white Ss, 66 Ne- 
groes, 6 Puerto Ricans, and 2 Orientals. 

One psychologist administered all Stan- 
ford-Binets, Form L, at the preschool age. 
Another administered all WISCs at school 
age. Stanford-Binet scores were withheld 
from the WISC examiner until testing was 
completed. 


RESULTS AND DISCUSSION 


Intertest Correlations 


The Stanford-Binet IQs correlated .67 
with the WISC Full Scale IQs (p < .01). 
Though significant, the relationship is con- 
siderably lower than .85, the intertest cor- 
relation at the same age, with the same 
examiner. However, the .67 correlation 
compares favorably with previously re- 
ported correlations between preschool and 
school-age Stanford-Binets. The median 
correlation obtained from the latter re- 
ports was found to be .74 (3, 4, 6, 8). 
The Stanford-Binet IQs correlated .64 
with the WISC Verbal Scale and .48 with 
the WISC Performance Scale IQs (both 
p's < .01), both lower than the correlation 
for the Full Scale IQs. This hierarchy of 
correlations agrees with previous findings. 


Comparison of Mean IQs 


Table 1 provides the data to compare 
the mean IQs for the present sample. It 
can be seen that the mean Stanford-Binet 
IQ was significantly higher than all three 
mean WISC IQs. This finding too agrees 
with previous reports. 

However, in the present study, with 
one examiner administering all Stanford- 
Binets, and another all WISCs, it can be 
argued that the mean differences reflected 
examiner bias rather than intertest differ- 
ences. Data were available to evaluate this 
alternative interpretation of the findings. 
If the results reflected examiner bias, one 
would expect lowest intertest correlations 
and largest intertest differences where 
scoring was most subjective. Since the 
WISC Verbal Scale entails greater sub- 
jective scoring than the Performance Scale, 
one would predict a lower intertest corre- 
lation and larger intertest differences for 
the former than the latter. Actually the 
reverse was found. The Verbal Scale cor- 
related better with the Stanford-Binet and 
showed a smaller mean intertest difference 
(Table 1) than the Performance Scale. 
Thus, the hypothesis of examiner bias does 
not appear to be supported by the data. 


Effect of IQ, Age, and Sex 


An analysis of variance was performed 
to see if the observed difference between 
the mean Stanford-Binet and WISC Full 
Scale IQs was related to IQ, age, or sex. 
Three Stanford-Binet IQ subgroups were 
created, below 90, 90 to 110, and above 
110. Two age subgroups were formed, below 
eight years and above eight. The number 
in each subgroup is shown in Table 2. 
Since the numbers were unequal, it was 
necessary to use the Walker-Lev (14) approximate 
method of analysis of variance.* 


TABLE 1 
COMPARISON OF MEAN STANFORD-BINET 
AND WISC IQS 
(N = 113) 

                                 WISC 
Measure             S-B       FS        VS        PS 

Mean               104.32    98.94    100.14     97.87 
SD                  15.96    11.26     11.40     13.07 
t: S-B vs. WISC               4.89*     3.60*     4.57* 

* Significant at the .001 level 


TABLE 2 
EFFECT OF IQ, AGE, AND SEX ON 
MEAN INTERTEST DIFFERENCES 

Variable         N      Mean(a)      F 

S-B IQ                             12.83* 
   <90          19      -2.68 
   90-110       59      +2.39 
   >110         35     +14.86 
WISC Age                            1.12 
   >8           84      +4.7 
   <8           29      +7.24 
Sex                                 1.50 
   Male         61      +3.64 
   Female       52      +7.46 

(a) Minus (-) denotes WISC higher than S-B; plus 
(+) denotes S-B higher than WISC. 
* Significant at the .001 level. 

The results shown in Table 2 indicate 
that age and sex had no effect on intertest 
differences, while IQ did. Individual t 
tests revealed that the difference between 
low and average IQ levels was significant 
at the .05 level (t = 2.26), while the differences 
between low and high IQ levels 
(t = 6.59) and average and high IQ levels 
(t = 5.47) were significant at the .01 level. 
None of the interactions was significant. 
The results indicate that increments in the 
preschool Stanford-Binet IQ increase the 
likelihood that it will be significantly 
higher than the school-age WISC IQ, and 
that the greatest intertest discrepancies 
occur at the high IQ levels. 


* Since some Ss had higher Stanford-Binet 
IQs relative to their WISC IQs while others 
had higher WISC IQs, intertest differences 
were transformed to a unidirectional scale 
to calculate the analysis. The scale used assumed 
that zero intertest difference equals 
30 points. 

The finding of greatest intertest differ- 
ences at the high IQ levels agrees with 
previous research. The data on age appear 
to differ with the previous report (10) of 
a greater intertest discrepancy at low ages. 
However, since the age range of the previ- 
ous study was eight years larger than that 
of the present study, the negative findings 
may be attributed to the relative homo- 
geneity of the sample. There have been no 
previous reports of sex differences in rela- 
tion to differences between the Stanford- 
Binet and the WISC. 


Effect of Race and Nationality 


Since the sample contained both white 
Ss and Negroes, it was possible to study 
the effect of race on intertest differences. 
The results showed that both white Ss 
and Negroes obtained higher Stanford- 
Binet IQs relative to their WISC IQs, 7.74 
and 5.18 mean points, respectively. A t 
value of 1.02 indicated no significant dif- 
ference between the means for both races. 

Though the sample also contained six 
Puerto Ricans and two Orientals, their 
number was too small to permit compari- 
son. However, it was necessary to demon- 
strate that these eight Ss did not signifi- 
cantly affect the results for the remaining 
sample. A comparison of samples includ- 
ing and excluding the eight Ss showed that 
both samples obtained higher Stanford- 
Binet IQs relative to their WISC IQs, 5.60 
and 4.60 mean points, respectively. A t 
value of .70 comparing the means was not 
significant, indicating that the eight Ss 
did not significantly affect the results. 




SUMMARY 


Previous investigators have compared 
the Stanford-Binet and WISC IQs of Ss 
retested at the same age by the same ex- 
aminer. The present study attempted to 
ascertain whether previous findings apply 
to the situation where retesting occurs at 
different ages by different examiners. 

Ss were randomly selected from a neo- 
natal clinic population in New York City. 
One psychologist administered all Stan- 
ford-Binets at preschool age; another all 
WISCs at school-age. 

Results show that the intertest correla- 
tion is decreased from .85, for retesting at 
the same age by the same examiner, to .67, 
for retesting at different ages with different 
examiners. However, the .67 correlation 
compares favorably with similar retest 
findings for the Stanford-Binet itself. In 
other respects, results support previous 
findings. Highest intertest correlations were 
obtained for the WISC Full Scale IQ, low- 
est for the WISC Performance Scale IQ; 
the mean Stanford-Binet IQ was signifi- 
cantly higher than the mean WISC IQ; 
and greatest intertest discrepancies oc- 
curred at high IQ levels. The results ap- 
peared comparable for white Ss and Ne- 
groes. Further, a small group of eight 
Puerto Rican and Oriental Ss did not ap- 
pear to affect the findings. 


CONSERVATION OF TEACHING TIME THROUGH THE USE OF 
LECTURE CLASSES AND STUDENT ASSISTANTS 


RUTH CHURCHILL 
Antioch College 


In 1956-7, a member of the mathematics 
department at Antioch College* became interested 
in contrasting what students 
learned when they were taught in small 
lecture-discussion sections with a laboratory 
led by the instructor and what they 
learned when the instructor lectured to a 
large class which was led in small group 
discussions and laboratories by a student 
assistant. He hoped that two kinds of savings 
in teaching time could be made. First, 
each section takes as much teaching time 
and preparation time as does a 
single large lecture group, so that replacing 
several sections with one lecture saves faculty 
hours. Second, if upperclass students can 
substitute for instructors in the laboratory, 
the instructor is freed for additional hours 
by a less highly skilled person. In the experimental 
year, teaching by sections took 18 hours a 
week of faculty time; teaching by lecture 
and student-led laboratories took four 
hours a week of faculty time and ten 
hours of a student assistant’s time. 

The specific hypotheses formulated were 
that: 

1. Students would learn skills and understandings 
relevant to the objectives of 
a course in fundamentals of mathematics, 
and this learning would be independent of 
the method of teaching the course. The 
specific methods contrasted were small 
lecture-discussion sections with a laboratory, 
all led by the instructor, and large 
lecture class, with discussions and laboratory 
led by an undergraduate student assistant. 

2. Student attitudes towards the course 
would also be independent of the method 
of teaching, that is, equally satisfying situations 
could be created under both methods. 


* Acknowledgements should be made to 
Gustave Rabson, who initiated the experiment 
and taught the classes involved; to 
Joan Pomerantz, senior major in government 
and mathematics, who served as the laboratory 
assistant; and to Lawrence Balch, 
mathematics major, who served as an essay 
grader. 


METHODS 


The following procedure was set up: In 
one division* the mathematics course was 
taught in two sections, ranging in size from 
20 to 30 students, in the usual manner, 
three meetings a week in which lectures 
by the instructor were combined with 
questions and discussion by the students. 
In addition, there was a weekly laboratory 
session (usually an hour long), also led by 
the instructor. In the other division, the 
instructor lectured twice a week to a class 
of about 70 students. This group had two 
laboratory sessions each week, in which 
all discussion, questions, and help were 
handled by a student assistant. 

Two aspects of learning in the course 
were selected for evaluation: background 
in skills and understanding of the nature 
of mathematics. A 126-item multiple-choice 
test was used to measure skills. 
Understanding of the nature of mathematics, 
considered especially important in 
terms of the objectives of the course, was 
measured by a short essay. 


* At Antioch, because of the cooperative 
work-study plan, the student body is in two 
divisions, which alternate on campus, one 
division studying while the other is away 
working. Thus, the two groups, or divisions, 
of students in the experiment were not on 
campus at the same time. 


Student attitudes towards the course 
were measured by student ratings of the 
instructor and by direct questions about 
the course. The student evaluation of the 
instructor employed a rating scale involv- 
ing five ratings: clarity of presentation, in- 
terest in the student, arousing interest in 
the subject matter, making learning active, 
and knowledge of the subject matter. Stu- 
dent evaluation of the course consisted of 
three open-ended questions: What aspects 
of the course did you like most? What as- 
pects of the course did you like least? In 
what ways could the course be improved? 

Students took the background test and 
wrote essays at the beginning and the end 
of the course, which was 20 weeks long. 
The instructor was rated in the fourteenth 
week of the course, and the course ques- 
tionnaire administered in the last week. 

Unfortunately, the method used for an- 
swering and scoring the background test 
did not permit ascertaining its reliability 
easily. Students were instructed not to guess 
but to mark all answers possibly right. 
The right answers were weighted to 
equal the sum of the possible wrong answers; 
the score was the weighted sum of 
right answers minus one point for each 
wrong answer. The only available data 
bearing on the reliability of the test was 
a correlation of .74 between pre- and posttest 
scores for 18 students in a section of 
the course not in the experiment. On the 
whole, the test is probably sufficiently reliable 
to detect group differences. 

Since there are no objective standards 
for measuring understanding of mathematics, 
the validity of the essay as a measure 
of understanding depended on the 
grading scale evolved and its reliability. 
The procedure for grading the essays was 
to group together all the essays, pre- and 
posttest, from sections and lecture class, 
for each of the four topics and to grade 
each topic separately. All identification 
both of student and of time was removed 
from the papers. Two graders were used; 
they had available model essays written 
by the instructor; and they agreed on a 
general definition of understanding of 
mathematics. The correlation between the 
two graders was .67. Since the sum of the 
two grades was used as the final grade, the 
reliability of the essay corrected by the 
Spearman-Brown formula for doubling 
the length is .80. 
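
The correction just described can be checked with the Spearman-Brown formula for a measure lengthened k times, r_k = k r / (1 + (k - 1) r). The short sketch below applies it to the inter-grader correlation of .67 with k = 2; the function name is illustrative.

```python
def spearman_brown(r, k=2):
    """Predicted reliability when a measure with reliability r is made k times as long."""
    return k * r / (1.0 + (k - 1) * r)


# Two graders, inter-grader correlation .67, grades summed (k = 2).
print(round(spearman_brown(0.67, 2), 2))
```

For r = .67 and k = 2 this gives 1.34 / 1.67, or about .80, the figure reported for the summed essay grade.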


RESULTS AND DISCUSSION 


1. The results in Table 1 indicate that, 
in terms of pretest scores on the back- 
ground test and on the essays, students 
taking the course in sections did not differ 
from those taking it in the lecture class. 

2. The data in Table 1 indicate that
both the sections and the lecture class
gained significantly from pre- to posttest
on the background test and on the essays.
They further indicate that the sections
did not differ from the lecture class in
amount of gain.

3. When the student ratings of the
instructor for the two sections are
compared in Table 2 with the ratings for the
lecture class, two significant differences
are apparent. The large lecture class rated
the instructor significantly poorer in
clarity of presentation, and in general they

TABLE 1
PRE- AND POSTTEST SCORES AND GAINS MADE BY SECTIONS AND LECTURE CLASS ON THE BACKGROUND TEST AND ESSAYS

                    Sections A and B    Lecture class     Sign. of
Test                Mean      SD        Mean      SD      Diff.

Background Test     (N = 47)            (N = 59)
  Pretest           115.6     36.8      113.9     40.3    0.6
  Posttest          192.8     42.5      187.8     47.9    0.2
  Gain               77.1     36.1       73.9     40.4    0.3
  t for gain         14.50*              13.94*

Essay Test          (N = 41)            (N = 54)
  Pretest            17.0      7.6       18.6      6.4
  Posttest           22.7      4.6       24.9      7.1
  Gain                5.7      8.0        6.3      7.7
  t for gain          4.5*                6.0*

* Significant at the 1% level.

TABLE 2
STUDENT RATINGS OF THE INSTRUCTOR

                                Sections A and B    Lecture class
                                (N = 46)            (N = 55)
Trait                           Mean      SD        Mean

Presents material clearly       2.0       1.1       3.0
Displays interest in student    1
Arouses interest in subject
  matter                        2.2       1.2
Makes learning active           2.0       1.2
Knows material                  1.5       0.9
Over-all                        9.5       4.2       11.3

[The remaining lecture-class entries and the Sign. of Diff. column are illegible in the source.]

Note. The lower the rating, the more favorable.
* Significant at the 1% level.


rated him slightly poorer on all traits so 
that the over-all rating in the lecture class 
is significantly poorer than that received 
in the sections. In both classes, however, 
the instructor was rated well above average
when compared with a sample of the
whole faculty.

Table 3 summarizes both the comments 
made by students on their ratings of the 
instructor and their answers to the three 
open-ended questions used to rate the 
course rather than the instructor. Stu- 
dents in both the small sections and the 
lecture class responded to the questions 
on the most and least liked aspects of the 
course predominantly in terms of content. 
When attention is focused on the instructor 
rather than the course, content drops out 
as a relevant category. Other than this 
major difference, evoked by the different 
structuring of the two situations, the two 
kinds of comments were similar. 

The instructor’s presentation of the ma- 
terial was clearly the most important vari- 
able present in both kinds of comments. 
Presentation was commented on more of- 
ten favorably and less often unfavorably 
in the sections than in the lecture class, 
significantly so in the case of unfavorable 
comments on the instructor. Another ma- 
jor factor, mentioned only unfavorably, 
was the rapid pace of the course. When 


TABLE 3
PERCENTAGE OF STUDENTS IN SECTIONS (S) AND LECTURE (L) MAKING EACH COMMENT ON COURSE AND INSTRUCTOR

Column groups: On course (N = 48S, 56L), aspects liked most and least; on instructor (N = 46S, 55L), favorable and unfavorable.

Comment categories: Content; Presentation; Pace; Laboratories; Class size (a); Examinations; Everything good; Miscellaneous.

[The percentage entries are illegible in the source.]

* χ² yielded difference significant at 5% level.
** χ² yielded difference significant at 1% level.
(a) Class size too large placed in unfavorable categories.

commenting on the instructor, the lecture
class made significantly more unfavorable
comments on pace. The significantly
greater number of unfavorable comments
made by the lecture class on presentation
and pace supports the significantly poorer
rating which they gave the instructor on
clarity of presentation.

The sections and the lecture class dis- 
agreed markedly about the laboratory, 
which was a favorable feature for the lec- 
ture group but not mentioned by the sec- 
tions. This difference can be accounted for 
in terms of differences in instructional pro- 
cedure: for the sections the laboratory 
was only another meeting with the instruc- 
tor while in the large class the small lab- 
oratory groups, which met with the stu- 
dent assistant, were a distinct feature. 

In respect to the hypothesis relating to 
student attitudes towards the course the 
final position must be a qualified rejection 
of the null hypothesis. The lecture class 
was somewhat less satisfied, particularly 
in respect to clarity of presentation; but 
the lecture class commented favorably on 
the laboratory. The lower ratings of the 
instructor on clarity of presentation may 
have occurred because part of his function 
had been taken over by the laboratory 
assistant. 


SUMMARY


1. The problem of whether or not fac- 
ulty time can be conserved through teach- 
ing in a large lecture class rather than in 
small sections and through replacing the 
instructor in the laboratory by an under- 
graduate student assistant was investi- 
gated by having the same instructor teach 
equated groups of typical students the 
same general education course in mathe- 
matics under two conditions: small lec- 
ture-discussion sections with a laboratory 
conducted by the instructor and a large 
lecture class with a laboratory conducted 
by a student assistant. 

2. On pre- and posttests on a test of 
relevant content and on an essay graded 
for understanding of mathematics, the two 
types of classes did not differ in amount 
of gain and both gained significantly and 
substantially. 

3. Although students in both types of 
courses rated the instructor and the course 
satisfactory, the lecture class was less sat- 
isfied than the sections. However, the 
comments in the lecture class indicated 
that the laboratory helped to meet student 
needs for discussion in which they could 
clarify the lecture for themselves. 


Received July 17, 1958.

During the past decade, a number of
investigations have been made to discover 
the relationship of factors other than in- 
telligence and aptitude scores to students’ 
level of achievement. In general, students 
have been selected for such studies on the 
basis of discrepancies between their pre- 
dicted level of achievement, as determined 
by aptitude tests and other criteria, and 
their actual level of achievement. 

Overachieving college students have 
been described in one report as likely to 
have “less fortunate backgrounds.” Fac- 
tors such as social enjoyment and prestige 
were generally found to have influenced 
the underachievers’ decision to attend col- 
lege (7). Other investigators have found 
certain personality factors, e.g., tendencies 
toward maladjustment (1), superego sta- 
tus (11), and overconformity (10), to 
have some influence on level of achieve- 
ment. Studies of the effects of remedial 
reading programs on achievement suggest 
that such programs result in improved 
achievement for some students (8, 9). 
Instructor grades appear to have been the 
only measure of achievement used in the 
above studies. 

The purpose of the investigation to be 
discussed here was to discover some of the 
factors responsible for differences in 
achievement as measured by instructors’ 
ratings and by common departmental 
term-end examinations. It appeared that 
some students in the Basic College Gen- 
eral Education Program at Michigan State 
University quite consistently received a
higher grade on their term-end examination
than they received from their instructors
in the Basic College courses, while other
students seemed to be equally consistent
in getting the higher of the two grades
from their instructors.

¹This study was part of a doctoral
dissertation completed in 1956 under the
direction of Paul L. Dressel and Walter F.
Johnson, Jr.

The curriculum in the Basic College 
embodies four comprehensive areas: Com- 
munication Skills, Natural Science, Social 
Science, and Humanities. Each of the 
basics consists of three courses taken in 
sequence, and both instructors and stu- 
dents are provided a common syllabus for 
each course. 

Students’ final grades in each of the
Basic College courses are derived from
instructors’ ratings and performance on
departmental term-end examinations, each
of which counts 50 per cent in the
determination of the final grade.² Prior to
conversion to the final letter grade, both
instructor grades and term-end examination
grades are assigned from a 15-point scale, 
with a score of one corresponding to F 
minus and a score of 15 corresponding to 
A plus. For each student who completes 
the Basic College Program, there is a 
record of 12 instructor grades, 12 term- 
end examination grades, and 12 final letter 
grades. The coefficient of correlation be- 
tween mean instructor grades and mean 
term-end examination grades of all Basic 
College students is generally approxi- 
mately .80. 


²Departmental term-end examinations are
multiple-choice tests constructed by the
Basic College Evaluation Services, a
noninstructional department which helps
develop, coordinate, and administer the
program of examinations and evaluation, in
conjunction with the various departments
involved.


PROCEDURE


Students whose instructor grades were 
generally higher than their term-end 
examination grades would appear to be 
characterized by traits which enhanced 
their performance in the structure of class- 
room activities and which commended 
them to their instructors. Thus, the hy- 
pothesis was presented that students who 
generally received the higher grade from 
their instructors were more insecure, com- 
pulsive, conforming, and rigid than stu- 
dents who generally received the higher 
grades on the term-end examinations. The 
Inventory of Beliefs test was used to test 
this hypothesis.³ This test consists of 120
statements with directions requesting the 
student to respond to each item in terms 
of the following key: 1. strongly agree, 2. 
agree, 3. disagree, and 4. strongly disagree. 
Since all of the statements should elicit 
disagreement, low scores are obtained by 
individuals who are characterized in terms 
of the above hypothesis, with the opposite 
being true of students obtaining high 
scores (4).

Basic College departmental term-end 
examinations are multiple-choice tests 
which are cumulative and increasingly 
comprehensive from term to term. These 
tests require a considerable amount of 
reading during the examination period. 
Thus, a second hypothesis presented was 
that the reading ability of students who 
generally received the higher grade from 
their instructors was inferior to that of 
their opposites. The Michigan State Uni- 
versity Reading Test, designed by mem- 
bers of the Basic College Evaluation 
Services, was used to test this hypothesis. 
This test yields a vocabulary score, a com- 
prehension score, and a total score. 


³Developed by the Intercollege
Committee on Attitudes, Values, and Personal
Adjustment: The Cooperative Study of
Evaluation in General Education of the
American Council on Education.


Students whose performance on the
term-end examinations was consistently
short of expectations evolving from their 
instructor ratings might well learn to an- 
ticipate the term-end examination as a 
threatening, anxiety-producing experience. 
The Taylor Anxiety Scale was used to test 
an hypothesis presented with respect to 
this problem, with the unsurprising result 
that anxiety thus measured was shown to 
be unrelated to the problem. Beier has 
demonstrated, however, that induced anx- 
iety can impair certain aspects of intellec- 
tual functioning, resulting in impaired 
performance on tests.⁴

The assumption underlying these hy- 
potheses was that differences in general 
scholastic aptitude and intelligence were 
not related to the phenomenon to be stud- 
ied. Nevertheless, it seemed inappropriate 
completely to disregard these factors, and 
comparisons on ACE scores were also 
made. The ACE, like the MSU Reading 
Test, is taken by all entering freshmen and 
it was the scores that the students in the 
study made on these tests at the time of 
enrollment which were used for the in- 
vestigation. 

Three groups of students were selected 
for the investigation of the problem: (a) 
students whose instructor grades were 
generally higher than their term-end ex- 
amination grades (higher instructor grade 
group); (b) students whose term-end ex- 
amination grades were generally the higher 
(higher examination grade group); and 
(c) students whose instructor grades and 
term-end examination grades were gener- 
ally about the same (nondeviant grade 
group). The latter group was selected for
purposes of comparison with both of the
two groups above to determine if these
two extreme groups were different from a
nondeviant grade group as well as from
each other with respect to the factors
studied.

⁴The t technique was used in preference
to the generally more appropriate analysis
of variance for data of this type because of
the investigator's interest in making
pairwise comparisons of the groups.

Statistical calculations for the selection 
of the above groups were based upon the 
instructor grades and term-end examina- 
tion grades obtained by populations of 
565 males and 469 females during their 
completion of the 12 Basic College courses.
Other investigators have found that 
women tend to get higher grades than men 
from instructors, while men tend to get 
higher grades than women on standard 
achievement tests (2, 12). To avoid this 
bias, means of the accumulative sums of 
examination and instructor grades, mean 
differences between the accumulative 
sums, and standard deviations of the dif- 
ferences were computed separately for 
men and women. Men and women thus 
selected were jointly assigned to their ap- 
propriate groups. (Women received both 
higher instructor grades and higher ex- 
amination grades than men. While wom- 
en’s examination grades were only slightly 
higher than their instructor grades and 
only slightly higher than the men’s
examination grades, women’s instructor grades
were substantially higher than the men’s.)
In order to limit the study to extreme 
cases, only those men and women were 
selected whose differences between their 
summed examination grades and summed 
instructor grades placed them at least two 
standard deviations beyond the mean dif- 
ference (E-I) between the accumulative 
sums of examination and instructor grades 
of the total male and female populations 
respectively. 

The above method of selection identified 
42 students as consistently obtaining
higher grades from instructors and 54 as 
consistently obtaining higher grades on 
the term-end examination. Of these num- 
bers, 29 students in the higher instructor 
grade group (14 males and 15 females) 
and 32 students in the higher examination 
grade group (20 males and 12 females) 
cooperated throughout the study. The
nondeviant grade group was composed of
32 students whose differences between in- 
structor grades and examination grades 
placed them within one third of one stand- 
ard deviation of the mean difference be- 
tween the accumulative sums of examina- 
tion and instructor grades. 
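
The selection rule described above can be sketched in a few lines. This is a minimal illustration, not the investigator's actual procedure: the function name, the labels, the use of the population standard deviation, and the sample differences are all assumptions. Each value is a student's summed examination grades minus summed instructor grades (E - I), and the rule is applied separately to men and women, as in the text.

```python
from statistics import mean, pstdev


def classify_by_difference(differences, extreme_sd=2.0, nondeviant_sd=1 / 3):
    """Label each E - I difference relative to the group's mean difference.

    Differences at least `extreme_sd` standard deviations above the mean are
    labeled "higher examination", those at least that far below are labeled
    "higher instructor", and those within `nondeviant_sd` standard deviations
    of the mean are labeled "nondeviant". Everything else is "not selected".
    """
    center = mean(differences)
    spread = pstdev(differences)  # assumption: population SD, not sample SD
    if spread == 0:
        raise ValueError("all differences are identical")
    labels = []
    for d in differences:
        z = (d - center) / spread
        if z >= extreme_sd:
            labels.append("higher examination")
        elif z <= -extreme_sd:
            labels.append("higher instructor")
        elif abs(z) <= nondeviant_sd:
            labels.append("nondeviant")
        else:
            labels.append("not selected")
    return labels


# Hypothetical E - I differences, kept separate by sex so that each student
# is compared with the mean and SD of his or her own population.
differences_by_sex = {
    "men": [-10, -1, 0, 0, 0, 0, 0, 0, 1, 10],
    "women": [-8, 2, 3, 3, 3, 3, 3, 3, 4, 14],
}
labels_by_sex = {
    sex: classify_by_difference(values)
    for sex, values in differences_by_sex.items()
}
print(labels_by_sex)
```

The text does not say how the 32 nondeviant students were drawn from those falling within one third of a standard deviation, so the sketch only marks who would be eligible.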

Members of the higher instructor grade 
group and higher examination grade group 
were interviewed prior to testing. Informa- 
tion gained from the interviews is dis- 
cussed below. 


RESULTS 


After a brief, standard description of the 
problem, students in the higher instructor 
grade and higher examination grade 
groups were asked if they knew which 
category described their performance. 
Only one interviewee (in the higher in- 
structor grade group) was unaware of the 
direction of her grades. Following the re- 
sponse to the above query, structuring of 
the interviews was restricted to the ques- 
tion: “How do you account for this?” 

In response to the above question, 15 
students in the higher instructor grade 
group expressed fear of the term-end ex- 
amination; 11 students in the higher in- 
structor grade group stated that too often 
information required on the term-end 
examination did not correspond to work 
covered in class; several labeled the
examination “too ambiguous”; and some
complained that both the tests as a whole 
and individual items were too long, re- 
quiring too much reading in the time al- 
lotted for the test. 

By contrast, the comments of 25 of the 
32 students in the higher examination 
grade group were interpreted to indicate 
a lack of motivation for and indifference 
toward the Basic College courses. Students
in the higher examination grade group 
generally saw the disparity between their 
examination and instructor grades as a 
phenomenon of their own making, while 
their much more anxious opposites tended
to see their circumstance as a rather
threatening problem which had eluded 
remedy. 

Mean grades presented in Table 1 be- 
low suggest, indeed, that students in the 
higher instructor grade group had some 
reason to feel threatened by the term-end 
examination. Their mean examination 
grade, about D plus, was, however, what 
one might expect of this group. In finding 
a group with higher instructor grades, one 
might expect to find their examination 
grades below average. Conversely, higher 
examination grades would seem to be as- 
sociated with below average instructor 
grades. Instead, we find the average in- 
structor grade to be the same for the two 
groups (C plus), and the mean examina- 
tion grade of the higher examination 
grade group considerably above average (B 
plus). The extremely high coefficients of 
correlation between mean instructor grades 
and mean examination grades seen in 
Table 1 are artifacts of the selection 
method. This artifact of selection mani- 
fests itself each time both mean instructor 
and mean examination grades are com- 
pared with a third variable. 


Tests of Significance of Differences 

In comparing the mean ACE scores, 
mean reading scores, and mean Inventory 
of Belief scores (see Table 2), considerable 
differences were found to exist between 
the higher instructor grade group and 
both of the other two groups. The mean 
ACE scores of both the higher examina- 
tion grade group and the nondeviant 
group were significantly higher than the 
mean ACE score of the higher instructor 
grade group. The difference between the
mean ACE scores of the higher examination
grade group and the higher instructor
grade group was significant beyond the
.001 level of confidence, while the
difference between mean ACE scores of the
higher instructor grade group and the
nondeviant grade group was significant
beyond the .01 level of confidence.

TABLE 1
MEAN INSTRUCTOR AND MEAN EXAMINATION GRADES, STANDARD DEVIATIONS, AND COEFFICIENTS OF CORRELATION BETWEEN MEAN INSTRUCTOR AND MEAN EXAMINATION GRADES

Deviate Groups        Mean E Grade    SD      Mean I Grade

Higher I Grades        6.81           1.29    9.15
Higher E Grades       11.63           1.46    9.06
Non-Deviate Grades     9.36           1.46    9.44

[The remaining columns are illegible in the source.]

Superior reading ability set the group 
with the higher examination grades apart 
from the other two groups. The mean 
reading score of the higher examination 
grade group is significantly higher than 
that of the nondeviant grade group be- 
yond the .001 level of confidence, while 
the mean reading score of the latter group 
is significantly higher than that of the 
group with the higher instructor grades be- 
yond the .001 level of confidence. The 
very small variance among the reading 
scores of the higher instructor grade group 
is one of the striking features of this group. 

Mean Inventory of Belief scores re- 
vealed no difference between the higher 
examination grade group and the
nondeviant group, but, as the data in Table 2
indicate, the mean Inventory of Belief
scores of both these groups were found to
be significantly higher than that of the
higher instructor grade group beyond the
.001 level of confidence. The group getting
higher instructor grades was thus char- 
acterized as being more compulsive, in- 
secure, rigid, and conforming. 
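
A pairwise t comparison of the kind reported in this section can be sketched from summary figures alone. The example is hedged in several ways: the article does not state which form of the t formula was used, so the pooled-variance form here is an assumption; the function name `pooled_t` is hypothetical; and the group sizes (32 and 29) are taken from the numbers of cooperating students given earlier, on the assumption that all of them took each test. The means and variances are the Inventory of Beliefs entries legible in Table 2, so the result need not reproduce the published value.

```python
from math import sqrt


def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Student's t for two independent groups, from summary statistics.

    Assumes var1 and var2 are sample variances and that the two groups
    share a common population variance. Returns (t, degrees of freedom).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Inventory of Beliefs: higher examination grade group versus
# higher instructor grade group (group sizes are an assumption).
t_value, df = pooled_t(79.29, 203.11, 32, 66.17, 304.49, 29)
print(t_value, df)
```

The same function can be applied to each pair of groups in turn, which is the pairwise use of t that the investigator preferred to a single analysis of variance.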


Correlation Analysis 


Estimates of the relationship of
students’ ACE scores, reading scores, and
Inventory of Belief scores to their mean
instructor and mean examination grades
(see Table 3) resulted in coefficients of
correlation which, in general, require little
comment.

TABLE 2
MEAN TEST SCORES, VARIANCES, AND TESTS OF SIGNIFICANCE OF DIFFERENCES BETWEEN EACH OF THE THREE GROUPS

Tests/All Groups          Mean      Variance

ACE
  Higher E Grades                   302.06
  Higher I Grades                   346.17
  Nondev. Grades                    447.40

MSU Reading Test
  Higher E Grades         57.       154.10
  Higher I Grades                    59.37
  Nondev. Grades          47.8      144.34

Inventory of Beliefs
  Higher E Grades         79.29     203.11
  Higher I Grades         66.17     304.49
  Nondev. Grades          78.34     116.91

[The N column, the t columns, and the remaining means are illegible in the source.]

* Significant beyond the .01 level of confidence.
** Significant beyond the .001 level of confidence.


TABLE 3
CORRELATION COEFFICIENTS SHOWING RELATIONSHIP BETWEEN STUDENTS’ TEST SCORES AND MEAN EXAMINATION AND INSTRUCTOR GRADES

                                             Higher E   Higher I   Non-deviate   Triserial
Variables Correlated                         Grades     Grades     Grades        (a)

ACE & Mean E Grades                          .35        .77        .25
ACE & Mean I Grades                          .37        .86        .27
ACE Scores for All Groups and (E − I)                                            .45

Reading & Mean E Grades                      .55        .57        .45
Reading & Mean I Grades                      .55        .61        .45
Reading Scores for All Groups and (E − I)                                        .68

IB & Mean E Grades                           .15        .00        −.10
IB & Mean I Grades                           .22        .05        −.15
IB Scores for All Groups and (E − I)                                             .52

[The assignment of values to the Higher E and Non-deviate columns is uncertain in the source.]

(a) Triserial correlation coefficients showing relationships of test scores of all groups to differences between examination and instructor grades (E − I).

INSTRUCTOR VERSUS EXAMINATION GRADES

The magnitudes of the coefficients
obtained in estimating the relationship of
ACE scores to mean examination and
mean instructor grades for the higher
instructor grade group, .77 and .86
respectively, are considerably greater than those
obtained for the other two groups and
greater, too, than customarily found.
Within this group, apparently, both the ex- 
amination and the instructor ranked the 
students quite consistently in relation to 
ability. There are personal qualities at 
work, however, which seem to commend 
the student to the instructor, resulting in 
higher grades from instructors. 

The values obtained in estimating the 
relationship of reading scores and Inven- 
tory of Belief scores to mean instructor 
and mean examination grades are similar 
to those values usually found in using these 
instruments. The Inventory of Beliefs 
typically yields a rather wide range of 
scores. 

Analysis of the relationship of the 
differences between students’ examination 
and instructor grades to test scores re- 
vealed rather substantial evidence that for 
these groups higher aptitude scores, read- 
ing scores, and Inventory of Belief scores 
were positively related to tendencies to 
get the higher grade on the term-end ex- 
amination. Jaspen’s formula for triserial 
correlation was used to determine the rela- 
tionship of difference between examination 
and instructor grades to test scores (5). 
A triserial coefficient of .45 was obtained
in estimating the relationship of difference
between examination and instructor grades
to ACE scores; a coefficient of .68 was
obtained in correlating reading scores with
these grade differences; and a coefficient
of .52 was obtained in estimating the
relationship of Inventory of Belief scores to
differences between examination and
instructor grades.


DISCUSSION


The mean instructor and examination 
grades of the higher instructor grade group 
were in keeping with the investigator's 
expectations, i.e., average grades from in- 
structors and below average grades on the 
term-end examination. Expectations of the 
converse for the higher examination grade 
group were not supported by the results of 


the study. This group’s very high mean 
ACE score and reading score also suggest 
the unlikelihood of finding many of the 
expected variety in the group. 

Considered from the point of view of 
ability to achieve, the evidence suggests 
that the higher instructor grade group 
received higher grades from their instruc- 
tors than they should have, while the 
higher examination grade group received 
lower instructor grades than they should 
have. The superiority of the nondeviant 
grade group to the higher instructor grade 
group in general aptitude and reading 
ability also raises a question about the 
similarity of the instructor grades of these 
two groups. Again, the evidence seems to 
force the conclusion that students who 
were characterized as being more conform- 
ing, compulsive, rigid, and insecure re- 
ceived higher grades from their instructors 
than would be expected of them on the 
basis of ability alone. The information ob- 
tained in interviews suggests that the 
average instructor grades obtained by the 
higher examination group must be ex- 
plained in terms of a lack of motivation 
for and indifference toward the Basic 
College courses. 

No thought was given to determining 
which of these two groups, higher instruc- 
tor grade group or higher examination 
grade group, were the overachievers and 
which the underachievers. With respect to 
instructor grades, the higher examination 
grade group could be called under- 
achievers; but, in general, they did make 
high term-end examination grades, thus
demonstrating a high degree of mastery of 
their subjects. Conversely, the higher in- 
structor grade group could be called over- 
achievers because their instructor grades 
appeared to be higher than warranted by 
their ability; but this group’s examination 
grades do not indicate mastery of the sub- 
ject, or overachievement. The results of 
this study suggest that what has often 
been called over- or underachievement may
in some cases have been a function of the
method of measuring achievement. In such 
cases a student’s grades might not be an 
accurate description of his relative mastery 
of the subject. 


SUMMARY


The purpose of this investigation was to 
discover some of the factors which differ- 
entiate students whose instructor grades 
were consistently higher than their grades 
on departmental term-end examinations 
from students who consistently got the 
higher grades on their departmental term- 
end examinations. Students who consist- 
ently received the higher grade from in- 
structors were found to receive average 
instructor grades and below average grades 
on the term-end examinations, while stu- 
dents who consistently received the higher 
grade on the term-end examinations were 
found to have superior examination grades 
and average instructor grades. Aptitude 
scores and reading scores of students who 
received the higher grades from instruc- 
tors were found to be significantly lower, 
beyond the .001 level of confidence, than 
those of their opposites in achievement. 
The Inventory of Beliefs test character- 
ized the higher instructor grade group as 
being more compulsive, conforming, rigid, 
and generally insecure than their opposites.